graph theory

I think I just had a wonderful idea.

This journalist has a machine that sifts through lots of raw data and writes a story — using a style tuned to his specifications — which I then read. I can keep reading his stuff or look elsewhere, but if that feedback reaches him at all, it must reach through several layers.

What if I had a machine like that, which sifted through the same data and wrote stories suited to my taste? I could make minute adjustments whenever I pleased, or read articles by multiple “journalists” on the same subject, give my scores and let them fight it out and evolve. The human journalists could still do the research, but I’d be subscribing to the pool of their findings, not to the condensate articles. And I could gin up as much of this bespoke news as I wanted, on any topic I wanted…
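
The "give my scores and let them fight it out and evolve" step could look something like the rough sketch below. Everything in it is an assumption made for illustration: each robo-journalist is reduced to a dict of made-up style settings ("snark", "detail") between 0 and 1, the better-scored half survives, and the rest of the pool is refilled with slightly mutated copies.

```python
import random

def evolve(pool, scores, rng, mutation=0.1):
    """Keep the better-scored half of the pool and refill it with mutated copies.

    pool   -- list of dicts mapping a (hypothetical) style setting to a value in [0, 1]
    scores -- the reader's score for each robo-journalist, in the same order as pool
    rng    -- a random.Random instance, passed in so runs can be repeated
    """
    # Indices of the pool, best score first.
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    keep = max(1, len(pool) // 2)
    survivors = [pool[i] for i in ranked[:keep]]

    children = []
    while len(survivors) + len(children) < len(pool):
        parent = rng.choice(survivors)
        # Nudge every setting a little, clamped to stay within [0, 1].
        child = {
            name: min(1.0, max(0.0, value + rng.uniform(-mutation, mutation)))
            for name, value in parent.items()
        }
        children.append(child)
    return survivors + children

# Made-up settings and scores, purely for illustration.
pool = [
    {"snark": 0.2, "detail": 0.9},
    {"snark": 0.8, "detail": 0.4},
    {"snark": 0.5, "detail": 0.5},
    {"snark": 0.1, "detail": 0.1},
]
next_pool = evolve(pool, scores=[2, 9, 5, 1], rng=random.Random(0))
```

The two favourites carry over unchanged and the low scorers are replaced by variations on them; repeat after each round of reading and scoring.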

What if there were a complementary engine that could read a story (with a certain date), extract the constituent facts, and deduce the settings the robo-journalist should have in order to produce something similar given the facts known at the time? Then one could reverse the editorial bias settings and read an opposing view. Or study a large body of articles to make a robo-journalist mimic of any human journalist, living or dead. (Maybe not a very convincing one, but the technology can only improve.) Ah, to read Hitchens again…

Such robo-journalists could have blogs, singly or in groups, and evolve to…

Yeah, this has potential.

DOlz

April 3, 2013 at 12:36 am

Big deal

This blog already has taken the next step. There are several bots that scan the articles here then generate random outrage based on keywords. Mind you the results aren’t perfect yet and can be fairly hilarious at times due a lack of reading comprehension.

vegetaman (profile)

April 3, 2013 at 4:54 am

Re: Big deal

Unfortunately the bots seemed to be programmed WITH random capitalization AND bad engrish.

DOlz

April 3, 2013 at 10:13 am

Re: Re: Big deal

That’s to make them seem more human. Like the bot that wrote my post wrote “due a lack” instead of “due to a lack”.

Mike, we really (at least I do) need an edit feature for our posts.

special-interesting (profile)

April 3, 2013 at 2:10 am

It’s hard to think of anything less copyrightable than news. Gossip, grapevine talk, heard-it-from-a-friend, over-the-fence backyard chat: these are all cultural ways to share information, particularly news plus our opinions about it. All of this is foundational culture, in the same way that we like to talk about the weather.

It’s nice to have automation to deal with crunching data from diverse sources and combining that into readable reports, but this also allows the user to do the same or more analysis given the same data. In this case the data would be more important than the article itself.

Many times I have wanted to create my own chart of stock market value vs. points in economic history that I thought were pivotal, complete with interest rate co-analysis and other real-time reports/values plotted alongside, and the ability to update it as new data arrives. How about prison population vs. GDP (economic drain on society), or maybe crop production vs. global average temperature? You get the idea.

Not many of our personal analyses would be perfect, but at least we would be looking/checking on our own. It might be nice not having to believe some pencil-neck pinhead hired by a special interest group to grind an axe.

In the US, data is not patentable or copyrightable, but a published printout can be copyrighted even though the facts in it cannot. So. We have the start of the data wars.

To keep one computer from taking or using the data from another, the first algorithm would not give out the raw data but some product of two data values, yielding a new industry value not rated by or known to the general public. It gives new meaning to the term data processing. This would prevent the average person from making an original analysis.

It’s already a common technique. Once the market has learned to measure the value of a product by standard industry values, special-interest industry groups doing market research find some new way to measure the product that makes smaller units seem larger, so buyers are willing to pay more. (Diagonal TV measurement instead of X and Y values, a VA rating for UPS computer power supplies instead of watts, etc.)
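
To make the diagonal example concrete, here is a small worked sketch. The function names and the two screen shapes are illustrative choices, not anything from a spec sheet: it recovers the X and Y values from a diagonal plus an aspect ratio, and shows that two screens sold under the same diagonal figure can differ in actual viewing area.

```python
import math

def screen_dimensions(diagonal, aspect_w, aspect_h):
    """Return (width, height) for a screen with the given diagonal and aspect ratio."""
    # By Pythagoras, the diagonal of an aspect_w x aspect_h rectangle is hypot(w, h);
    # scale that rectangle so its diagonal matches the advertised one.
    scale = diagonal / math.hypot(aspect_w, aspect_h)
    return aspect_w * scale, aspect_h * scale

def screen_area(diagonal, aspect_w, aspect_h):
    """Return the viewing area implied by a diagonal and an aspect ratio."""
    width, height = screen_dimensions(diagonal, aspect_w, aspect_h)
    return width * height

# Two hypothetical 50-inch screens that differ only in shape.
wide_area = screen_area(50, 16, 9)   # about 1068 square inches
boxy_area = screen_area(50, 4, 3)    # 1200 square inches
```

The single advertised number is the same for both, yet the 4:3 panel has roughly 12% more area than the 16:9 one. Without the second value the buyer cannot do the comparison himself.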

As if what is heard on popular media news sites were not suspect enough, how would knowing that it was manipulated by robots be any better? At least with news anchors we could blame a person, but how can one fault an algorithm? I have started to mistrust any article/research/experiment/causation/idea that does not provide original data sources.

Anonymous Coward

April 3, 2013 at 2:26 am

The next step is consumer side customization.
I wake up to a morning newspaper customized to MY tastes, data taken from various sources on the Internet and the stories chosen for me based on my interests, edited for me based on my reading habits, written based on my bias.

jjmsan (profile)

April 3, 2013 at 9:19 am

Re: Re:

I bought both my children and grandchildren dictionaries. The reason for this is that when I was a child, I learned quite a bit from words I found on the way to words I was actually trying to find. If you can customize everything you can see you won’t find things you don’t know. The sad thing is you won’t even know you missed them.

Anonymous Coward

April 3, 2013 at 4:48 am

Most of what you see and hear on network news these days is advertising and editorial jocularity. The talking heads might as well be replaced with Max Headroom. Real journalism still exists, but one needs to look for it. I doubt a robot will be performing the task anytime soon, unless, possibly, it is Marvin.

Anonymous Coward

April 3, 2013 at 6:22 am

Spit Take

“*Yes, I realize lasers don’t make noise or “zoom” by, but that hasn’t prevented George Lucas from becoming insanely rich, has it?”

My keyboard thanks you for its coffee.

Andrew D. Todd (user link)

April 3, 2013 at 7:00 am

Bad Newspaper Writing.

People have come up with programs which can take tabular information (stock market results, baseball scores, etc.) and plug the results into a prose template. It’s not really intelligent, of course, but it has merely exposed the banality of much newspaper writing.

http://news.slashdot.org/story/12/05/12/1233224/could-a-computer-write-this-story
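
For what it's worth, the template trick can be sketched in a few lines of Python. This is only a toy under my own assumptions (the field names, the team names, and the verb thresholds are all made up), not how any real news-writing product works:

```python
from string import Template

# One canned sentence with blanks to fill in from a row of tabular data.
RECAP = Template("$winner $verb $loser $high-$low on $day.")

def write_recap(game):
    """Turn one row of box-score data into a one-sentence recap."""
    if game["home_score"] == game["away_score"]:
        raise ValueError("this toy template has no sentence for a tie")
    home_won = game["home_score"] > game["away_score"]
    winner, loser = (
        (game["home"], game["away"]) if home_won else (game["away"], game["home"])
    )
    high = max(game["home_score"], game["away_score"])
    low = min(game["home_score"], game["away_score"])
    # The only "style" here: pick a stock verb from the margin of victory.
    margin = high - low
    verb = "edged" if margin == 1 else "beat" if margin < 5 else "routed"
    return RECAP.substitute(
        winner=winner, verb=verb, loser=loser, high=high, low=low, day=game["day"]
    )

# Made-up teams and scores.
row = {"home": "Hawks", "away": "Owls", "home_score": 3, "away_score": 2, "day": "Tuesday"}
sentence = write_recap(row)
```

With that row, `sentence` comes out as "Hawks edged Owls 3-2 on Tuesday." Swap in a different row and you get a different "story", which is rather the point about banality.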

Certain news stories, notably those in sports, are highly formulaic. I had a look at some sports stories in the local newspaper, which were written by humans. It turns out that sportswriters aren’t even very good at articulating what actually happened in a baseball or football game, so they pad it out with statistics, standard cliches, and summary reports of previous games. For example: “Smith, whose batting average is such and such, struck out. He had also struck out in yesterday’s game, and in Wednesday’s game.” The sportswriter is padding out minimal information about a game which he may not even have attended. Obviously, a computer can paste in bits and pieces from old articles, that kind of thing. As I said, this is really bad sports-writing. It does not tell you something interesting, for example, that Smith has superb reflexes but he swings at a lot of pitches which are not over the plate, and if he only held back, the umpire would call them balls. Perhaps Smith is a simple sort of man, easily baited in general, and perhaps Jones, the pitcher, is a clever, mocking sort of fellow, like the boxer Muhammad Ali (“If he gives me any jive, I’ll take him in five!”), who is very good at needling simple men into indiscretions, the way a matador plays a bull. Perhaps Jones is carefully placing the ball just far enough beyond the plate that Smith will swing at it and miss, and become still more enraged at missing…

The kind of paper a baseball umpire might maintain for his own use might look a lot like a spreadsheet, a pre-printed sheet with room to fill in information as it develops. He can tick a box to indicate a strike, a foul ball, a home run, etc., and work up a kind of shorthand similar to that used to indicate chess moves. “1 KP-K4, KP-K4; 2 QP-Q4, PxP; 3 QxP, QN-QB3; 4 Q-K3, N-B3” is quite expressive in its way. One could have an analogous system for baseball. There might be a stylized diagram of the playing field, on which movement of the ball and the runners could be indicated by drawing arrows. A tablet computer, with the right software, could save time and labor in recording this diagrammatic information, and automatically generate an Acrobat file for publication. In the case of football, there is a conventional language of diagrams, which is used to show players how they are supposed to move when a given play is called. It can equally well be used post-hoc, to describe how they actually did move.

Of course, the umpire could add idiosyncratic sidenotes, where appropriate, e.g. “Pitcher threw ball, which struck batter in ear. Batter picked himself up, gathered up his bat, ran to mound, hit pitcher over head with bat. Outfielders ran towards mound, as did players from the bench, the latter carrying additional bats. Faction fight ensued. Opposing fans joined in, carrying automobile tire irons. Game canceled. Police called.”

If you are looking at _good_ sports-writing, say Ernest Hemingway’s _Death in the Afternoon_, James Michener’s _Sports in America_, or Donald Hall’s _Fathers Playing Catch with Sons_, that is not something a computer can easily replicate. But they talk about much bigger subjects than merely who won the game.

Mason Wheeler (profile)

April 3, 2013 at 8:19 am

Of course there's a simple answer

“This leads to the ethical quandary presented by the use of bots. Is robo-generated journalism really journalism, and is the use of algorithms a betrayal of readers’ trust, especially when a familiar name is on the byline? If factual errors are discovered, does the blame lie with the software, or with the journalist who agreed to let the article ‘write itself?’

“The answer here isn’t simple (and the question likely isn’t even fully formed yet), but the key is transparency.”

The answer is very simple: If the reporter put his name on it, he is accountable for it. If his algorithm screws something up and he lets it go to print without even checking it first, then of course the blame lies with him. Why wouldn’t it?

Anonymous Coward

April 3, 2013 at 8:43 am

Is warmed by your welcome: http://redhat.com

Andrew D. Todd (user link)

April 3, 2013 at 9:34 am

Present Data As Data Without Trying to Verbalize It Unnecessarily.

Another point is that data should be presented as data. For example, weather reporting properly consists of a link to the National Oceanic and Atmospheric Administration site.

http://www.weather.gov/

Weather is not local: it involves huge masses of air, moving over hundreds or thousands of miles. The NOAA electronically collects data from all over the world, and from space, dumps it all into the computer, sets it up as partial differential equations, and generates worldwide predictions. Only the government has the money to do this kind of thing. By inputting a latitude and longitude, you can get a forecast for any particular location. A newspaper’s attempts at creating its own daily weather reports can only be inferior.

The highest grade of weather reporting is aviation weather reporting. There are systems to more or less instantly download NOAA weather reports to airplanes in flight, with the information being automatically posted to moving-map displays. The pilot zooms out his map so that he can see two or three hundred miles ahead, and finds that a bunch of red marks have just appeared a couple of hundred miles away. So he gets on the radio, and talks to ground control about an alternate route.

The United States Geological Survey does the same thing for earthquake reports. They collect data from vast numbers of seismographs which are plugged into computer networks, and whose results are automatically processed by computers and posted on a website.

http://earthquake.usgs.gov/

BentFranklin (profile)

April 3, 2013 at 10:50 am

“Welsh says that responsibility for accuracy falls where it always has: with publications, and with individual journalists.”

True enough. They are responsible for the behavior of their bots.

But contrast this with statements by people who don’t want the responsibility that comes when their torrent software delivers music to the IP police, because they say they didn’t know it would do that.

Programming The News: The Future Of Reporting Is Algorithms

from the I-for-one-welcome-our-new-fedora-clad-robotic-overlords dept

Comments on “Programming The News: The Future Of Reporting Is Algorithms”