
Corporations will abuse your personal integrity whenever they get a chance, while abiding the law. Corporations will cry like babies when their publicly available data (their livelihood) gets scraped. They will take you to court.

They consider their data to be theirs, even though they published it on the internet. They consider your data (your personal integrity) to be theirs as well, because how can you assume personal integrity when you are surfing the internet?

I have high hopes that the judicial system will, some time not too far from now, realize that since the law should reflect current moral standards, it will always lag behind, trying to catch up with us, and that those who break the law without violating those moral standards are still "good citizens" who do not deserve prison or fines.

I guess Google won this iteration of the internet because of the double standard site owners stand by: they allow Google to scrape anything while hindering any competitors from doing the same. There will only be a true competitor to Google when, in the next iteration of the internet, we realize that searching vast amounts of data (the internet) is a solved problem and that anyone can do as good a job as Google, and move on to the next quirk. There will be competition around that quirk, in the end it will be solved, and we'll have a winner, signaling that it is time to move on to the next iteration.



> Corporations will abuse your personal integrity whenever they get a chance, while abiding the law.

Call me cynical if you will, but I'd leave "while abiding the law" out of that, or at least replace it with "while hoping they aren't breaking the law". Due diligence on these matters is often sadly lacking. They'll take the information first and only consider any such implications when/if they come up later.

Large organisations like Google probably will make the up-front effort to remain legal, because they are in the public eye enough for lack of doing so to attract a lot of unwanted press, but you don't have to get a lot smaller than that to start finding companies who are a lot less careful (or in some cases wilfully negligent).


I would use Microsoft as a precedent. Sure they will attempt to stay legal but by pushing it as far as they can.

For instance, the browser choice script that the EU imposed on Windows never worked. It was a "bug". Somehow they must have omitted to test the feature...

Until last year, Microsoft had started playing nice, and I think Google and Facebook have become the new corporate villains. But recently the Windows team seems minded to challenge them for that position.


Often, it's indeed cheaper to pay a government-mandated fine than lose market opportunities afforded by behavior that later runs afoul of some law or regulation.


The difference is that Google didn't agree to not scrape your data. You, as per their TOS, agreed not to scrape theirs, as part of the condition of using their service.


Which TOS?

I might have accepted terms when I created a Google Account but in no way do I agree to a TOS by visiting a URL.


To see the terms that Google thinks you have agreed to, click 'Terms' at the bottom of www.google.com

If that doesn't hold up in court, in future on your first visit to Google it will simply display some text and require that you click 'I agree' to continue.

Either way, it seems reasonable to me that you should agree to their terms in order to use their service.


So if instead I scrape their site (like they are scraping others) I don't have any opportunity to agree to their terms? Much like their scrapers on other sites?

I'm honestly wondering about the double standard. There is a rational way to discuss morality/ethics and subsequent laws regarding most technical aspects, that often mirrors real world (read: offline/analog) scenarios. It's unfortunate that the legal system has instead been appropriated by lawyers.


>> It's unfortunate that the legal system has instead been appropriated by lawyers.

omg, really?

It's unfortunate that the internet has instead been appropriated by hackers. It's unfortunate that the stock market has instead been appropriated by traders. It's unfortunate that the asylum has instead been appropriated by inmates.


To some extent, yes. When people spend enough time in their given field to know the ins and outs, those less scrupulous tend to bend the rules more and more. While not _strictly_ against the rules it often ends up going against the spirit at the base of the industry.

Very few traders went to jail after 2008. Seemingly legal (or at least not illegal). Should they have? Most bright/talented lawyers are likely working (again within the law) to get megacorps or rich people off for something poorer people would not. In our field this OP is one of the issues. What information is free and what information is not? What things I'm allowed to do offline am I allowed to do online?

I'm not proposing a solution, but any system populated by humans will be abused by some, and fought for by some idealists, all within that system's rules.

Let's take murder: I stab someone: murder. I use a broom to push a flower pot off a balcony, hitting someone in the head and killing them: murder. I swat a butterfly in Beijing, causing a chain of events that ends with a container crushing a dock worker in Rotterdam. Murder? If this extreme example comes down to intent, it's thought crime; otherwise I'm playing within the rules of the system, and I just killed someone, scot-free.

There apparently were no laws prohibiting the upsale of bad mortgages, and banks had the resources to move the market towards more and worse mortgages. That was also within the system's rules, but I personally think it's far beyond the intended use of that market, and well outside the spirit of the laws.

There's a huge difference between judicial justice and what most would agree was "justice". That's where my first comment came in. True about most systems.


Don't leave us hanging--did the butterfly make it??


Yeah, really. Law is not an end in itself; it's meant to serve a purpose. When people whose jobs are instrumental to the goal start deciding what the goal is, bad things happen. The same goes for MBAs and businesses.


There's no double standard. In the case of crawling and scraping their site the terms are available in the robots.txt file. And Google abide by the robots.txt terms of other websites.

I'm not sure why you dislike this 'appropriated by lawyers' outcome: For web crawling look at robots.txt, for other uses look at the Terms link on the homepage. If you don't agree to the terms then stop accessing the website. Seems straightforward and fair to me.
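To make the robots.txt point concrete, here is a minimal sketch of how a crawler could check a site's rules before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `may_fetch` helper, the "NewCrawler" agent name and the example.com URLs are all made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, invented for this example.
# One named crawler gets broad access; everyone else is shut out.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "NewCrawler"):
        allowed = may_fetch(ROBOTS_TXT, agent, "https://example.com/public/page")
        print(agent, "allowed" if allowed else "blocked")
```

A real crawler would download the file from the site's `/robots.txt` first (for example with `set_url()` and `read()`), but the check itself is the same.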


Yeah, you're right in response to my comment. It was a bad example. But while google.com (for example) has a robots.txt, you could argue that it's neither exactly fair nor inviting of disruption. For example, it whitelists Twitter and Facebook for images (and so in effect blacklists everything else). While I won't cry too much foul, I get the feeling that Google entered the stage when the internet was quite a bit more wild west (for good and bad), and then the internet changed, partly because of them and partly because of other actors. For at least some markets I believe it's almost impossible to get a footing now as a new actor, as they're only open to what are basically cartels. Email is another one, as you can be locked out of gmail.com or outlook.com communication with basically no recourse if you run your own email server.


The TOS that Google follows is published in the robots.txt file. If you don't want Google to scrape your site, then that's all you need. There's no double standard.


I'm sure that's true for your average Wordpress publisher, but the big guys will either slap you with a lawsuit or take other measures to make you stop crawling their site.

Scraping and crawling are the same thing, btw. I absolutely love how the English language has several words for the same thing. Your language is very expressive.

Google is a scraper. Your data will end up in their index. You are perfectly OK with Google "stealing" your data.

A new player crawling your site is an offence to you. How dare someone other than Google or Bing put pressure on my site? How dare they steal my data?

TOS is a joke.

I wonder, what did the founding fathers of the internet intend for it? Was it not to make data publicly available?


> If that doesn't hold up in court, in future on your first visit to Google it will simply display some text and require that you click 'I agree' to continue.

This statement is demonstrably false, as shown by all the places in the world where this type of TOS-nonsense actually does not hold up in court.

And in the USA, it's (as usual) even slightly more absurd: The only reason it does hold up in court is because Google can afford justice.


Try using Google from a fresh install; they'll force you to accept their TOS.


Are they A/B testing this or is acceptance IP-based? I reinstalled recently and I didn't see it. Firefox in private navigation mode also lets me use it without forcing me to agree with anything.


Lucky you. I get their stupid modal overlay more often than I'm happy with. On top of that it now usually defaults to Dutch and Dutch results even when I don't want this. Highly annoying.


I'm under the impression that simply having a visible legal notice like "By visiting this page you agree to our ToS" is enough to bypass that.


Not in the EU, you have to explicitly and manually agree to them.


Regarding the scraping and the legality of it all. I wonder if it's still illegal if you respect the robots.txt and other meta elements in html standards.

If Google's actions were illegal, I'm sure they would have been sued, even if their scraping and indexing is usually helpful for the website owner.
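As for the "other meta elements": the common convention is a `<meta name="robots" content="...">` tag in the page head. Here is a rough sketch, using only Python's standard `html.parser`, of how a scraper might read those directives before deciding what to do with a page. The `RobotsMetaParser` class and `robots_directives` helper are names invented for this example, and whether honouring the tag changes anything legally is a separate question this does not answer.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None (e.g. a bare attribute), so guard for that.
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    directives = robots_directives(page)
    print("skip indexing" if "noindex" in directives else "ok to index")
```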



