DailyDirt: Big Data Isn't Necessarily Better
from the urls-we-dig-up dept
The old GIGO principle (Garbage In, Garbage Out) originated during the early days of computing, but it may be even more applicable today. With the explosion of data that can now be collected, there’s a temptation to assume that analyses and meta-analyses can make sense of all that data and produce incredible insights. However, we should probably keep some skepticism before we jump into the deep end of data and expect miraculous results.
- Microsoft researchers report that they think they can diagnose internet users with pancreatic cancer, just by analyzing a large number of search requests. It’s not clear what can be done with this research, since the data was anonymized (and therefore no one can be contacted). And if users know their searches are being monitored for serious health issues, will they continue to use search engines that might creepily diagnose them? [url]
- Google Flu Trends attempted to predict flu seasons based on people’s searches, but it didn’t end up doing such a great job. Future versions of this project could be better, but after its initial success at predicting flu trends, the project suffered acutely from “big data hubris,” a treatable (hopefully) ailment. [url]
- A widely told parable about the pitfalls of neural nets describes how the US Army once tried to train software to detect camouflaged tanks in various images. The complex software didn’t learn how to detect tanks at all, but instead focused on the clouds, which the algorithms determined correlated well with tanks. Oops. Who wants to trust AI to make life-or-death decisions if we can’t understand what the machines are thinking? [url]
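
The tank parable is easy to reproduce in miniature. What follows is only a toy sketch, not the Army's software or a neural net: it swaps in a one-feature rule learner (a "decision stump"), and the features `brightness` and `shape`, the `photo` helper, and all of the data are invented for illustration. It assumes, as the usual telling goes, that the training photos with tanks happened to be taken under overcast skies.

```python
# Toy illustration only: every "photo" is reduced to two made-up features.
#   brightness: 0 = overcast, 1 = sunny
#   shape:      1 = a tank-like outline was detected, 0 = not
#   tank:       the true label (1 = tank present, 0 = no tank)

def photo(brightness, shape, tank):
    return {"brightness": brightness, "shape": shape, "tank": tank}

# Assumed training set: every tank photo is overcast, every empty scene is sunny.
# One well-camouflaged tank shows no outline (shape = 0).
train = [photo(0, 1, 1)] * 4 + [photo(0, 0, 1)] + [photo(1, 0, 0)] * 5

# Assumed field set: the weather no longer lines up with the tanks.
field = [
    photo(1, 1, 1), photo(1, 1, 1), photo(0, 1, 1),
    photo(0, 0, 0), photo(0, 0, 0), photo(1, 0, 0),
]

def accuracy(model, rows):
    """Fraction of rows where the rule 'tank if feature == value' is right."""
    feature, value = model
    hits = sum((r[feature] == value) == bool(r["tank"]) for r in rows)
    return hits / len(rows)

def fit_stump(rows):
    """Pick the single (feature, value) rule with the best training accuracy."""
    candidates = [(f, v) for f in ("brightness", "shape") for v in (0, 1)]
    return max(candidates, key=lambda m: accuracy(m, rows))

model = fit_stump(train)
print("learned rule: tank if", model[0], "==", model[1])
print(f"training accuracy: {accuracy(model, train):.2f}")
print(f"field accuracy:    {accuracy(model, field):.2f}")
```

On this invented data the learner settles on the overcast-sky rule, because it separates the training photos perfectly while the outline feature misses one camouflaged tank. Once the weather stops cooperating, the same rule gets only two of the six field photos right, even though the outline feature would have gotten all six.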
Filed Under: ai, artificial intelligence, big data, big data hubris, cancer, ed fredkin, flu trends, gigo
Companies: google, microsoft
Comments on “DailyDirt: Big Data Isn't Necessarily Better”
Big Data
Is a shortcut term for some manager’s wet dream: that there exists some magic genie, and if they could just know something, they could do something to really rake in the dough and curb stomp the fuck out of competitors.
Now this is all well and good, but there is another terrible, very terrible side effect that a lot of people do not consider. Just like in science, a researcher will create a setup where the data points in the direction they wanted to be true to begin with. You just cannot breed unconscious bias out of a manager any more than you can out of a baby.
If Big Data does not show management what they want to hear, then it’s back to the drawing board. Or worse, Big Data is only used to research topics they care about while ignoring subjects they “feel” have already been addressed, because they are afraid that big data might bite them in the ass and reveal just how terrible they are at making decisions… well, just like every other fucking person on the planet. It’s just that their egos will not allow much of it.
AI for Driverless Cars
Personally, I can’t wait for all those millions of driverless cars to hit the road.
I know of at least 10 different areas within 25 miles of my house where the GPS maps are HIGHLY incorrect, and are DETERMINED to have you ‘turn here’ (right into the lake, the river, or the condemned 5-story building).
Yeah, who ya gonna sue when your driverless car drowns your family in the local lake? I’ll bet there’s an app for that, too (or a clause that will be written into law, saying you can’t sue the map company, the car company, or the electronics company for their untimely death).
pancreatic cancer
From the original article:
“the researchers declined to offer specific details”
Of course. Why save lives when you can keep a proprietary algorithm proprietary?