Content 2.0: The Future Of Web Search
Yahoo's Vice President of Product Strategy Bradley Horowitz explored the issues around search and shared his vision of social search and its potential with the audience at Content 2.0 on 6th June 2006...
KEYNOTE: THE CHANGING FACE OF WEB SEARCH
Report by Deirdre Molloy
“I’m just a remixer” was Bradley’s opening gambit, “I’m taking a
lot of ideas that are floating around Yahoo and floating around my
group”. He referenced Jamie Kantrowitz’s earlier comment that
Yahoo! are a Web 1.0 company re-tooling for Web 2.0. There’s a
lot of truth to that, Bradley noted, and it’s both a blessing
and a curse.
They’re blessed with the largest internet audience out there,
over half a billion people come to Yahoo! every month. But when
you think about something like Yahoo Mail and the throughput and
number of transactions they get on that and then you think about
changing that infrastructure to catch up with Web 2.0 – that’s a
daunting task and they’re faced with that at every turn.
We’re committed to opening up, Bradley stressed, it’s not just a
small band of pirates running around trying to convince the
rest of Yahoo! to open – it’s the seasoned management of the
company. The founders, the senior level executives are mandating
that Yahoo! needs to open up. We can see the writing on the
wall, Bradley reasoned. We understand we need to do this, not
out of some altruistic generosity, but to become and remain a
viable business, he emphasised. Bradley wanted to credit some of
the people whose work and thoughts he was remixing, including
Tom Coates, Paul Hammond, Simon Willison, Danah Boyd, Jeremy
Zawodny (Yahoo’s Robert Scoble), Caterina Fake, Chad Dickerson,
and others.
"We’re committed to opening up... we can see the writing on the wall."
- Bradley Horowitz
The title of his talk had changed from ‘The Changing Face Of Web Search’ to ‘Better Search Through People’, Bradley explained, adding that there’s no trademark on this title ;-) and he proceeded to outline their strategy for social search.
He showed a pyramid diagram of community dynamics, such as found in Yahoo Groups – 1% creators, 10% synthesizers and 100% consumers lurking or somehow benefiting from the top 11%’s activities. You can create viable strategies around this model. Yahoo are trying to change this community dynamic so it becomes 100% creators and synthesizers.
To achieve this, they first need to lower the barriers to participation. Yahoo’s Launchcast radio product does this very well, Bradley reckoned: it gets better and better at predicting the music I will like, and I do this for a selfish motive. But I can also, with a simple tick-box, publish my own radio station as an artefact to share with others. So the act of consumption is also implicitly an act of production. I didn’t have to do anything else to produce and author that content – it was something that was created in the wake of my consumption. That’s the kind of magic they’re looking for, he said.
He then went on to look at “surfacing great content” and played a sequence of some of the top 100 photos rated for “interestingness” on Flickr – they are taken by amateurs but have a powerful and professional impact. This is the best, he reckoned, of what culture has to offer. And it’s also user-discovered content.
When they discovered Flickr it was pointed out to them that Yahoo! already has the world’s biggest and most successful photo site - Yahoo Photos - so why were they sniffing around this small Canadian company of just 10 employees? He highlighted 4 things that make Flickr special.
(1) The content is wonderful and all user-generated
(Yahoo! have been in user-generated content for nearly a decade so it’s well understood). In a digression, Bradley explained that his previous speciality had been computer vision at MIT which, while exciting, is very difficult. It’s hard to throw a bunch of pixels at an algorithm and have it bring back a meaningful assessment of what’s going on in the image, and it’s going to stay hard for a really long time. But, good news, people do this really well, and there are 6 billion of them – and that’s the magic of Flickr. He cited the ESP Game built by Luis von Ahn from Carnegie Mellon University, which, like Tetris, you don’t want to quit. It’s for people who like word games, and the entire game is a ruse to exchange labour for fun. They’ve collected 10 million tags on images on the internet, and these tags are double-blind quality assured: 2 people who can’t communicate agreed on the tag. It’s really clever and very subtly embedded in the incentive system of the game. When he saw that, a bell went off in his head.
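The double-blind agreement rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ESP Game's actual code: the function name `agreed_tags`, the normalisation step and the sample guesses are all hypothetical.

```python
def normalise(tag):
    """Lower-case and trim a guess so 'Dog ' and 'dog' count as the same label."""
    return tag.strip().lower()


def agreed_tags(guesses_a, guesses_b):
    """Return the labels that two isolated players both typed for one image.

    Hypothetical sketch: a label is accepted only when both players, who
    cannot communicate, entered it independently.
    """
    seen_by_a = {normalise(t) for t in guesses_a}
    agreed = []
    for tag in map(normalise, guesses_b):
        # Keep player B's order, skip duplicates, keep only shared labels.
        if tag in seen_by_a and tag not in agreed:
            agreed.append(tag)
    return agreed


player_a = ["Dog", "grass", "ball"]
player_b = ["puppy", "ball", "dog "]
print(agreed_tags(player_a, player_b))  # ['ball', 'dog']
```

Because neither player sees the other's guesses, a label only survives if two people arrived at it separately, which is what makes the collected tags trustworthy without an editor checking them.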
(2) User organised content
Content is tagged, described, organised and discovered not by editors but by the users themselves. There are no rules, and by lowering the barriers to entry and allowing tagging, 85% of the images now have human-added data. There’s no spell check, you can make up words and input whatever makes sense to you.
(3) The users themselves picked up Flickr and distributed it around the internet
It integrates with popular blogging software. The terms of service are such that if you click on a photo on a blog, it takes you back to the Flickr site. It wasn’t business development [or marketing – Ed] but the community that grew and distributed Flickr. People began to use Flickr as the underlying imaging infrastructure of their blogs. So Flickr began to appear on literally tens of thousands of sites without any business development guy doing anything; the people did it for them. The community distributed Flickr.
(4) User developed community – and user developed functionality
The entire ecosystem was created by fewer than 10 employees – aided by millions in the Flickr community. Flickr built Flickr as a platform, not just an end-user destination, so they considered developers from the very inception of the product, wrote bindings for all the popular languages – PHP, Python, C++, Java – and let the developer community go at it. And literally thousands of developers have built value against the Flickr API, some of it very useful and utilitarian, like the Macintosh Uploader. Again, you couldn’t build that on Yahoo! Photos. It’s just amazing the kinds of possibilities that are unlocked when you have this huge archive of timely, relevant, interesting content, all richly annotated – you can build cool things. He pointed us to www.flickr.com/services where a bunch of them are featured.
Returning to the idea of lowering barriers to participation, Bradley explained that they’ve biased things to let people do what they’re really good at, and use the algorithms for their own purposes. They launched “interestingness” – which is based on a heuristic of how often a photo is viewed, commented on, blogged and favourited – so harvesting people’s behaviour to provide additional value. And it was done by the community, not by editorial staff, and in a way that is implicit rather than explicit (an explicit rating system would have led to all kinds of gaming of the system and veered the product off in the wrong direction). It reflects natural activity around the photos.
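As a rough illustration of such a heuristic, the sketch below scores a photo as a weighted sum of the four implicit signals mentioned in the talk. Flickr's real formula was not described, so the weights, field names and the `interestingness` and `rank` functions are assumptions made purely for the example.

```python
from dataclasses import dataclass


@dataclass
class PhotoActivity:
    """Implicit signals named in the talk; the field names are hypothetical."""
    views: int = 0
    comments: int = 0
    blog_links: int = 0
    favourites: int = 0


# Assumed weights, chosen only for illustration: a passive view counts for
# far less than a deliberate action such as a comment or a favourite.
WEIGHTS = {"views": 0.01, "comments": 1.0, "blog_links": 2.0, "favourites": 1.5}


def interestingness(photo: PhotoActivity) -> float:
    """Weighted sum of the activity that naturally accumulates around a photo."""
    return sum(weight * getattr(photo, name) for name, weight in WEIGHTS.items())


def rank(photos: dict[str, PhotoActivity]) -> list[str]:
    """Photo ids ordered from most to least interesting."""
    return sorted(photos, key=lambda pid: interestingness(photos[pid]), reverse=True)


photos = {
    "sunset": PhotoActivity(views=5000, comments=4, favourites=2),
    "street": PhotoActivity(views=800, comments=30, blog_links=5, favourites=40),
}
print(rank(photos))
```

Nobody is asked to rate anything here: the score is derived entirely from what people already do with the photos, which is the implicit quality Bradley stressed.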
Bradley explained that when they introduced tagging the librarians among them said it would never work. But the “query disambiguation” system applies - whereby Flickr looks at the co-occurrence of terms and breaks them down into their constituent clusters. There is cluster-based analysis on any tag and the clusters re-orientate in real time. That’s part of what they mean by better search through people. The data decides how these clusters unfold.
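One simple way to picture co-occurrence clustering is sketched below. It is not Flickr's implementation, which the talk did not detail. It assumes that two tags belong together when they appear on the same photo alongside the ambiguous tag, and then groups them by connected components. The function name `tag_clusters` and the sample data are hypothetical.

```python
from itertools import combinations


def tag_clusters(photos, tag):
    """Group the tags that co-occur with `tag` into clusters.

    Assumed approach: link two co-tags when they appear on the same photo
    as `tag`, then take the connected components of that graph.
    """
    neighbours = {}
    for tags in photos:
        if tag not in tags:
            continue
        co_tags = set(tags) - {tag}
        for t in co_tags:
            neighbours.setdefault(t, set())
        for a, b in combinations(co_tags, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk collects everything reachable from `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


photos = [
    {"jaguar", "car", "xk8"},
    {"jaguar", "zoo", "cat"},
    {"jaguar", "cat", "animal"},
    {"jaguar", "car", "classic"},
    {"beach", "sunset"},
]
print(tag_clusters(photos, "jaguar"))
```

In this toy data the tag "jaguar" splits into an animal cluster and a car cluster. The clusters are recomputed from whatever tags exist at the time, so they shift as users add new photos, with no librarian deciding the categories in advance.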
"The difference between information and knowledge is really that human factor and that’s what we’re all about."The corporate vision of Yahoo! is to enrich people’s lives by enabling them to find, use, share and expand (FUSE) all human knowledge [as I heard Yahoo European Director of New Business Rob Jonas articulate at the 25th Jan ALPSP event in London on book digitization explain in more detail – Ed] and they always ask at Yahoo! have you fused your product? It’s a very different mission and vision from their competitors, he noted, the words human and knowledge are in their vision – the difference between information and knowledge is really that human factor and that’s what they’re all about.
- Bradley Horowitz
A potted history from Bradley followed: Yahoo was created by two guys who dropped out of college and went on to form a company that organised the web for the benefit of the rest of us. This scaled until 1995, when the net exploded: technologies like Lycos and AltaVista arrived on the scene, bots started crawling the web, and people (i.e. webmasters) started gaming the search system. Yahoo’s approach was editorial and authoritative, but it didn’t scale. The next major innovation came when Google introduced PageRank – a link-based analysis of a site’s relevance in search queries. People do try to spoof this system using giant link farms and other workarounds, Bradley admitted, but it’s not easy.
Search still isn’t solved, and there are big issues with PageRank. If you think about query composition, about 25% is informational, 40% is navigational (people are still using the search box as a steering wheel to get around), and transactional queries are less than 35% of all search queries. Yet it’s the transactional ones that aren’t well served by today’s technology, as they’re subjective – do you know a reputable plumber in London? what political blogs are good? – and ironically they’re the most valuable class of queries, Bradley observed.
"Social search is about democratizing the process and saying why can’t regular people contribute to that voting process?"
- Bradley Horowitz
PageRank confers upon webmasters the privilege of deciding what’s right for all of us: whether you or I do a query on IBM, we all get the votes that the webmasters have cast by proxy for us, and we get the same results. Social search is about democratizing the process and saying why can’t regular people contribute to that voting process and, further, why can’t you ask people whom you trust what they recommend?
Social search is about democratizing the voting process and taking it away from webmasters. Their new service Yahoo Answers allows you to get at information that doesn’t yet exist online. The premise behind Answers is that you can ask other people. It relies on community moderation and is a community marketplace.
The product actually came out of Korea, which has a small but highly wired population. A couple of years ago the problem there was that there were very few articles in Korean on the web. So Naver created a platform called KnowledgeIn and now they are the dominant player and Yahoo! and Google are nowhere. They did it because they were able to establish this knowledge marketplace and capture knowledge and mindshare. Yahoo! copied that idea and launched in Taiwan and the USA where it’s been very successful. They introduced Yahoo Answers gradually – through the Flickr audience initially – and began to snowball it, and now they have search integration so it percolates through search queries.
"Delicious unearths buzz and zeitgeist and Yahoo! can use that as a tool to improve navigation and discoverability across all our products"
- Bradley Horowitz
Then he turned to social bookmarking site Delicious, which sort of represents the top of the pyramid, he observed. They have a small decentralized audience. It unearths buzz and zeitgeist and Yahoo! can use that as a tool to improve navigation and discoverability across all their products. Another Yahoo! product is My Web, which allows you to search, for instance, for London hotels and see which ones your friends stayed in, in addition to the millions of organic results, so it percolates the subjective into the search process.
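The My Web idea described above, where friends' choices surface within the organic results, can be pictured with the minimal sketch below. It is not Yahoo!'s implementation: the function `personalised_results`, the flat per-friend `boost`, and the example URLs and names are all assumptions for illustration.

```python
def personalised_results(organic, saved_by, friends, boost=1.0):
    """Re-rank organic results using pages saved by the searcher's friends.

    organic:  list of (url, score) pairs from the ordinary search engine
    saved_by: dict mapping url -> set of users who saved or tagged it
    friends:  set of users the searcher trusts
    boost:    assumed score bonus for each endorsing friend
    """
    ranked = []
    for url, score in organic:
        endorsers = saved_by.get(url, set()) & friends
        ranked.append((score + boost * len(endorsers), url, sorted(endorsers)))
    # Stable sort: results with equal scores keep their organic order.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return [(url, names) for _, url, names in ranked]


organic = [
    ("hotel-a.example", 0.9),
    ("hotel-b.example", 0.7),
    ("hotel-c.example", 0.5),
]
saved_by = {"hotel-c.example": {"ana", "raj"}, "hotel-b.example": {"zoe"}}

for url, names in personalised_results(organic, saved_by, friends={"ana", "raj"}):
    print(url, names)
```

Two people running the same query now see different orderings depending on whom they trust, which is the contrast Bradley drew with everyone receiving the same webmaster-voted results.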
With time running out he took two questions from the audience, the first of which was: why is it taking Yahoo so long to monetise user-generated content? With regard to Flickr, ads are displayed on the pages of all non-Pro account holders, and Bradley explained that in terms of the printing process they don’t yet have printers in the UK or anywhere else outside the USA, which is largely a function of the non-internationalised third-party suppliers they use.
BT Group Director of Web Services Sam Sethi asked, given how well they’d integrated microformats, when was Yahoo! going to buy Technorati? Bradley said they had no plans to but he was very supportive of their work in the area of microformats, and that geo-tagging is also something that would be good within Flickr and something else to look forward to.
Content 2.0 - 2006 conference Website:
http://www.content2point0.com/2006/
About Bradley Horowitz:
Bradley Horowitz, Vice President of Product Strategy, is responsible for leading Yahoo!’s efforts in building innovative search technologies. Bradley’s expertise helps drive initiatives that enable the company to provide comprehensive and compelling offerings to customers. Previously he managed a portfolio of products for Yahoo!, including media search, desktop search and the Yahoo! Toolbar. Prior to joining Yahoo!, Bradley served as both the chief technical officer and the vice president of engineering for the Virage division of Autonomy, where he was responsible for the technical delivery of five major product lines. Prior to Autonomy, he founded Virage, the company widely recognized as the market creator and leader for advanced media indexing and analysis. He helped grow the company from “a garage startup” through its NASDAQ IPO. Bradley was a PhD candidate at the MIT Media Lab. While at the Media Lab, he worked on a number of topics related to computer vision, graphics and image processing, which resulted in a patented new technique for the recovery of structure, motion and camera parameters from video sequences. Bradley holds an MS in Media Science from MIT and a BS in Computer Science from the University of Michigan. He blogs at http://www.elatable.com/blog/