oskar on Sun, 13 Jul 2003 18:41:40 +0200 (CEST)
Re: <nettime> googolo-structural digest [sheetz, douwe]
Hi

> > but the point -- that google's technology is a political one -- holds
> > because its algorithm encodes the political structure of popularity...
>
> An interesting idea. What are the other possible search engine political
> structures? The two most successful approaches right now are:
>
> The Google model. The distinguishing feature of Google is the role that
> link popularity plays. This is independent of the actual search term (as
> far as we know), i.e. a page has a certain Google Rank and that helps the
> page, no matter what the user is searching for. This is not so much a
> populist structure as a technocratic one. It is not the popularity among
> searchers that determines success, but the popularity among website
> builders/bloggers/corporations, what have you.

I personally love Google; I almost never use any other search engine.
However, I believe that it's going to have to undergo changes fairly soon.
Its very strength, PageRank, will become its greatest weakness.

Google's ranking system is, unfortunately, a feedback system, and this
feedback system seems to lack adequate controls to stop positive feedback.
There may be some magic factor in the PageRank calculation to try to damp
this effect, but I've not personally seen mention of it.

As you say, the feedback system is flawed because the people building web
sites are the people "voting" on the pages. But I see another fairly large
problem with this approach.

Let's say that I've recently become interested in the PageRank system. I do
a Google search, work through the first few pages of results, and find lots
of interesting things. I then write up a web page, put a discussion of
PageRank on my site, and include a selection of the most interesting links
I've found.

So: what just happened? When Google looks at my page, it finds links to the
pages that it already has at the top of the index, which raises the ranking
of those pages. Every time someone uses Google to find links to include on
their own page, they increase the ranking of the top-ranked pages. And this
happens all the time. Unless I spend a very long time looking around the
net, trying to find useful links that are not "top rated", all I ever do is
increase the ranking of the already top-rated pages.

This has huge social ramifications. If you're top-ranked now, it's going to
become harder and harder to be displaced from your top ranking. The longer
Google remains the search engine of choice, the more limited people's
access to alternative viewpoints will become. Further, people with a higher
search ranking will turn their sites into commercial entities, in the hope
of making money from them. Slowly but surely, the top-ranked sites will
become more and more commercial. Since there is a huge number of "stagnant"
pages on the net, which are never updated, it will be even more difficult
to remove these top-ranked pages from the elite "first page of results on
Google" list.

In fact, is this not already happening? One of the most interesting things
I've seen in the last few years is that the net seems to have become less
and less free, and pushes the boundaries of "traditional society" less and
less. When it does push, it's almost never in the traditional web space;
it's in blogs, email, and other places. Perhaps this is because the
top-rated sites are now so firmly entrenched at the top of the lists.
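To make the loop concrete, here is a toy simulation -- my own sketch, not
Google's actual algorithm; raw inbound-link counts stand in for PageRank.
In every round a new author "searches", sees the current top results, and
links to a few of them from their new page:

  import random

  def simulate(pages=50, rounds=200, top_k=10, links_per_author=3):
      # score[i] = inbound links to page i (a crude stand-in for PageRank)
      score = [1] * pages
      for _ in range(rounds):
          # a new author looks at the current "first page" of results...
          top = sorted(range(pages), key=lambda i: score[i], reverse=True)[:top_k]
          # ...and links to a few of them from their new page
          for target in random.sample(top, links_per_author):
              score[target] += 1
      return sorted(score, reverse=True)

  print(simulate()[:10])   # the initial leaders soak up nearly all new links

Whatever happened to be on the first page at the start ends up with
virtually all of the new links, and everything outside it stays flat.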
Alternative viewpoints end up appearing on page 8 or 9, and since nobody
ever sees them, nobody ever links to them, and they never progress up the
list. Perhaps Google is already influencing our world view more than we
expect.

The primary way of getting out of this feedback loop seems to be separation
of the measurement system from the feedback system. I find it exceptionally
interesting that the blogging system is causing "disturbances in the force"
of PageRank. Blogging, email, and other such things are quite fundamentally
different from the web. They are time-based, off the cuff, and there is a
lot of internal cross-referencing and meme transfer. Blogging and email
rely less often on the results of the search engines for their content:
they are based on real-world experience, and largely on more human emotions
(i.e. they often represent an alternative point of view to the norm). If
there's a link to a website, it's often because it relates to something
people have not seen before, or to an alternative point of view. I see very
few blogs going "Hey wow, check this cool site: www.amazon.com". They
almost always point to alternative viewpoints or current events. There's a
lot of competition to be "the first" to mention some cool page. People
still link to CNN, of course, but they normally link to a specific article
at CNN that's pertinent to the day.

Perhaps the time aspect of blogging could fundamentally change things.
Blogs log the passage of time much more than the traditional web does.
Perhaps the importance of links could be made to decay as they get older
and less relevant.

There are various other ways that you could collect feedback; if you could
intrude on everyone's email and extract the URLs, you'd have a much more
"intimate" ranking of pages... pages that people wouldn't want to put on
their home page next to pictures of their cat. There is still a feedback
system, of course, since people will often search for a page and then refer
it to a friend, but it seems less direct. I normally email people
hard-to-find or interesting links: stuff that the first hit doesn't find,
and which takes interesting search criteria to dig up. I am, of course, not
your average net user.

There are a variety of other ways that this situation could be improved;
some that spring immediately to mind are listed here (a rough sketch of how
(1) and (4) might be combined follows at the very end of this message):

1) Have some sort of exponential reduction of popularity, so that pages
   which are very highly rated by links are penalised for being so highly
   ranked, and may be overtaken by other pages that actually have a lower
   number of links.

2) Randomly insert some lower-ranked pages into the first page of results.

3) When you put a link on a page, include a tag that indicates where you
   got it from. Page indexers then read that tag and say "ah, the author
   got this link from my index - reduce my index value by some appropriate
   amount". Obviously there are problems with this approach, since it means
   updating every single link on the net :)

4) Try to reduce the impact of older pages. If a page hasn't been updated
   for a very long time, it's likely that the author hasn't looked at the
   links they are referencing for a while. You could ignore these links in
   the ranking, or even penalise the destination page. Similar to the blog
   idea above.

Oskar Pearson

#  distributed via <nettime>: no commercial use without permission
#  <nettime> is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: [email protected] and "info nettime-l" in the msg body
#  archive: http://www.nettime.org contact: [email protected]
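As promised above, here is roughly how ideas (1) and (4) might be combined
in a single re-scoring pass. Again, this is only my own sketch: the
function name, the half-life, and the exponent are all made up, and the
input is simply a list of last-modified timestamps, one per inbound link.

  import time

  def adjusted_score(link_timestamps, now=None, half_life_days=180.0, alpha=0.8):
      # Idea (4): a link from a stale page counts for less. Each inbound
      # link is weighted by how recently the *linking* page was last
      # modified, using exponential decay with the given half-life.
      now = now if now is not None else time.time()
      weighted_links = 0.0
      for ts in link_timestamps:
          age_days = max(0.0, (now - ts) / 86400.0)
          weighted_links += 0.5 ** (age_days / half_life_days)
      # Idea (1): sub-linear growth. Raising the total to a power below one
      # means a page with ten times the links does not get ten times the
      # score, so lower-ranked pages have a chance of catching up.
      return weighted_links ** alpha

  now = time.time()
  fresh = [now - 5 * 86400] * 100          # 100 links from recently updated pages
  stale = [now - 5 * 365 * 86400] * 100    # 100 links from pages untouched for years
  print(adjusted_score(fresh), adjusted_score(stale))

Only the search engine could actually run something like this, of course;
it has the link graph and the last-modified dates, and we don't.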