Getting the Data

I briefly mentioned that my local paper stopped providing articles without requiring a registration. It’s a free registration (at least at this point), but I find it annoying. For weeks, I made do without. Then, after searching, I noticed that an article listed as Charlotte was showing up on a completely different domain. It turns out to be the same company – just another way of getting at the data without registering. So I’m back.

For years, I’ve known that the Microsoft search engine is broken. I can often find articles – at Microsoft – faster through Google than I can find them through Microsoft’s own engine. Explain that.

I also stopped visiting Yahoo! news a while back. It’s not that they require a login – I don’t think they do – but the articles disappear before long. Using them as reference points just doesn’t work. To be fair, I don’t think Yahoo! archives the data in order to charge extra fees to get to it later – I think they just purge their articles regularly. But still, if I’m going to link to something, I’d like the link to be there when someone comes by to read it a week (month, year) later. So I don’t use Yahoo! news either.

Now, Wired talks about how the New York Times hides their data behind a registration page, and then they make it unavailable after one week online. I personally don’t read articles at the New York Times. If I see that it’s their article, I skip it. I could register, but I don’t. It’s just not worth it to me, and I’ll find it elsewhere or I’ll do without.

I don’t know the answer – these companies are in it for money, and if the Times thinks this is the best way to do that, that is certainly their prerogative – but I can’t help but think that there has to be a better option out there than the roads that these companies are taking.


Comments

4 responses to “Getting the Data”

  1. Chad Everett

    The other problem with sites such as Bugmenot is that they tend to go away.

  2. Chad Everett

    As to Bugmenot or other proxy registration services, I would just rather not. As Wired points out in another article, it’s not so much an issue of the difficulty. It’s just that I don’t want to bother with the hassle. If it becomes more work for me to get at the data I want, then I’ll do so – if the data is worthwhile. If I can shift my habits slightly and get at the same data, or even improved data, without the whole mess of registration or installing plugins? I’m all over it.

  3. Chad Everett

    Good point: That article link you provided does seem to work, some weeks later.

    I think the (Wired) article is geared towards regular old newspaper content data, that someone might access while browsing the web. Generally speaking, dynamic data such as this particular example doesn’t seem to get indexed by Google – making it somewhat useless as an archival tool.

    If you happen across the article through Google (it can be done – try this link), you’ll be given a URL similar to the one you posted, but it doesn’t have the query information following it. When you go to view it, you can view only the beginning, and are required to purchase it to view the rest.

    Apparently, as you point out, the article is available in its entirety online – you just have to know the URL to get to it. I tried appending the same query string to a few other article URLs, and it doesn’t necessarily seem to work for them, so that may not be the best way to get at it.

    So it seems that the point of the article is still the same: Go to Google, you don’t get many NY Times articles. When you do, you may have to pay for the whole thing.

    Now, if you happen to have an RSS feed you’ve been browsing, or perhaps a source for aggregating the content of the feeds, it may be another story entirely… (for what it’s worth, I couldn’t find the link you posted through Bloglines).

  4. P6

    Wired doesn’t have it quite right. NY Times links in their RSS feeds don’t expire. (That’s a random link, the oldest one still in my RSS reader.)