Nieman Journalism Lab

Finding tools vs. making tools: Discovering common ground between computer science and journalism

Posted: 14 Feb 2013 10:42 AM PST

The second Computation + Journalism Symposium convened recently at the Georgia Tech College of Computing to ask the broad question: What role does computation have in the practice of journalism today and in the near future? (I was one of its organizers.) The symposium attracted almost 150 participants, both technologists and journalists, to discuss and debate the issues and to forge a multi-disciplinary path forward around that question.

Topics for panels covered the gamut, from precision and data journalism, to verification of visual content, news dissemination on social media, sports and health beats, storytelling with data, longform interfaces, the new economic landscape of content, and the educational needs of aspiring journalists. But what made these sessions and topics really pop was that participants on both sides of the computation and journalism aisle met each other in a conversational format where intersections and differences in the ways they viewed these topics could be teased apart through dialogue. (Videos of the sessions are online.)

While the panelists were all too civilized for any brawls to break out, mixing two disciplines as different as computing and journalism nonetheless did lead to some interesting discussions, divergences, and opportunities that I’d like to explore further here. Keeping these issues top-of-mind should help as this field moves forward.

Tool foragers and tool forgers

The following metaphor is not meant to be incendiary, but rather to illuminate two different approaches to tool innovation that seemed apparent at the symposium.

Imagine you live about 10,000 years ago, on the cusp of the Neolithic Revolution. The invention of agriculture is just around the corner. It’s spring and you’re hungry after the long winter. You can start scrounging around for berries and other tasty roots to feed you and your family — or you can stop and try to invent some agricultural implements, tools adapted to your own local crops and soil that could lead to an era of prosperity. If you take the inventive approach, you might fail, and there’s a real chance you’ll starve trying — while foraging will likely guarantee you another year of subsistence life.

It’s certainly fine to use found tools, especially if they solve the problem without a hitch, but too many practicing journalists are still operating in the “forager” mindset. They use only the available tools they can scrounge up to solve the immediate problems related to finding, making sense of, and presenting news information. We heard from David Clinch about how his startup, Storyful, uses readily available tools for vetting and verifying online social media, rather than developing those kinds of technical capabilities in-house. For them, the innovation is in how people use those tools in an overall process. We also heard from Mo Tamman, a veteran data journalist now at Reuters, who initially referred to himself as a “tool whore” but backtracked to settle on the label “tool omnivore,” suggesting he would grab whatever widget would help him nail the story. Nick Lemann, dean of the Columbia University Graduate School of Journalism, called journalists “deadline epistemologists” to communicate the time pressure journalists feel in constructing their knowledge of the world.

Researchers and technologists generally have a different mindset. They engage in forging the implements that let the enterprise scale and flourish. They are often more interested in generalizing and productizing than in individual one-off stories. Academics just don’t have the same kind of deadline pressure as (most) practicing journalists. And as, I believe, Larry Birnbaum said: Journalists think in terms of stories, technologists think in terms of products. Stories are specific but products are generalizable.

This overarching sentiment seems to lead the technologists (and researchers) to want to engage in cultivating tools. In the very first session, with Irfan Essa and the journalism legend Phil Meyer, the message for journalism education was loud and clear: Don’t just teach people how to use tools; teach them how to make them. Imagine a computational fluency and aesthetic akin to the best journalistic writing. What might that look like?

This difference in thinking about time horizons opens up a raft of important questions. If journalists should start cultivating computational tools, how should j-schools approach and teach that? Should tools emerge out of universities or newsrooms that subsidize their development, or are startups a better model for innovative tool development? And once you start building tools, how do you make sure the good ones diffuse into widespread practice? Teaching user-centered design is one place to start, as is learning more about computational thinking. And while data and computational literacy skills will increasingly be needed, not all journalists will need to code. The same way there is specialization among other media within the newsroom, building and coding tools for gathering, assessing, and presenting information need not be universal journalistic activities.

The influence engine

Phil Meyer noted that the product of journalism is not eyeballs, but influence. In particular, he noted that trust was the main currency journalists use to achieve that influence. He warned that it’s not enough to just get the information into readers’ hands — it also has to get into their heads. The accountability mission of public affairs reporting is, when you stop to reflect on it, oriented towards influencing the government, in a positive way. It’s all about influence.

Alberto Cairo, when speaking about telling stories with data, brought up a similar idea — how storytelling is often framed as a persuasive, convincing, or impactful enterprise. He suggested that influence might not be a core journalistic value, but it is nonetheless a prevalent one. Indeed, Columbia is beginning to fund research on the study of “impact” — or, more plainly stated, the “influence” — of journalism.

In the session on news dissemination on social media, Gilad Lotan noted the push to quantify influence on social networks. It’s a hard problem, given the myriad exogenous influences that we all experience on a daily basis: friends, social media, television, individual experiences, and so on. There’s money to be made in understanding how or when to present specific content in order to optimize its exposure. And ad revenue from wide exposure is certainly one thing — but other types of influence and impact are less well rewarded by the market, as Duke’s Jay Hamilton alluded to in his session on the media economy.

All of this talk of influence and impact smacks of an opportunity for computational journalism. With growing data sets and better instrumentation across media, could we finally begin building a computational influence engine? Such an engine would know how to optimize not just for revenue, but also for diverse exposure to media and an informed public — as well as for accountability impact. Engineering and biasing computational news media algorithms towards “good” exposure could potentially also help correct for known human biases of perception and cognition.

Narrative analytics or analytic narratives?

Several of the speakers at the symposium spoke of the tension between narrative and analytic communication. “We have an impossible and problematic profession because we’re trying to marry narrative and analysis which are fundamentally incompatible…but that’s what makes it fun,” Lemann said. Tamman, the data journalist from Reuters, also spoke of the “analytic spines” that he uses to hold up the narrative part of his data-driven stories. Let’s call this the narrative-dominant frame.

But not everyone wholeheartedly shares the idea that narrative should dominate in the expression of news information. Cairo, a visual journalist who works on data storytelling in infographics, cited a number of objections that people have posed to him over the years: Stories can overweight the value of anecdotes, they can impose narrative on data that is not necessarily complete or cohesive, and they tend to privilege what the designer wants the user to see.

There are other methods that can be used to structure data. Meyer noted that theories are often used to structure the communication of scientific data. And computer scientists are familiar with the idea of using models to describe and communicate data. Models allow for the abstraction of data so that it can be efficiently and effectively computed on. We can call this the analytic-dominant frame.

This brings us back to the raw cultural difference of the value of “theory” or “model” (i.e. understanding the central tendency and abstraction of data) versus the “anecdote” or “outlier” that is so important to journalists feeling they’ve got a good story to tell. We may be just at the beginning of understanding the benefits and tradeoffs of the narrative-dominant frame versus the analytic-dominant frame, but it’s certain that the cultural dilemma of how news communication is approached underscores a central challenge in integrating computation and journalism.

Looking to the future

I’ve barely scratched the surface here, and if I’ve answered any questions in this article, it’s probably by accident. If, on the other hand, your curiosity is piqued, then I would urge you to keep your eyes open for the third installment of the Computation + Journalism Symposium in 2014. We hope that, over time, this burgeoning and multi-disciplinary assembly can grow into the place where journalists and technologists come to meet each other and trace a combined path towards a better news information environment. A special thanks goes out to the interdisciplinary organization and sponsorship from Columbia, Duke, Georgia Tech, Northwestern, Stanford, and the National Science Foundation, which set the stage with a collective mindshare and ability to convene and attract both computer scientists and journalists.

Painting: “Stone Age,” by Viktor M. Vasnetsov, 1882-85.

The newsonomics of zero and The New York Times

Posted: 14 Feb 2013 07:25 AM PST

Perhaps zero is the loneliest number.

That’s the number The New York Times Co., the United States’ second-largest newspaper company, ended up with for 2012, as it reported revenues last week. The most accurate number, when the year’s anomalous 53rd accounting week is taken out, is 0.3 percent up. Still, that’s a variety of zero, especially when we look at NYT Co.’s last quarter, where it ended down 0.7 percent.

Zero would seem to indicate not much happening here. Zero is an unsexy digit for corporate revenues — flat, as in pancake. It’s not growth, and it’s not death-spiral decline — both phenomena much easier to describe. So zero is causing lots of problems for those — financial analysts, investors, and media beat reporters — trying to describe the state of legacy print media today. But, in fact, the Times’ zero is noteworthy, and tells us much about the strategies — and straits — of our publishing times.

The New York Times Co.’s zero is actually a milestone number. It’s the first increase, however meager, in overall revenues since 2006, when it managed a 1.8 percent increase in revenues.

Check out the chart below and you can see, year by year, the sharp decline in NYT Co. revenues. Back in 2007, advertising and circulation revenues totalled almost $3 billion. The decline since then has happened almost entirely in advertising.

If you take out the $276 million earned in 2007 by the company’s Regional Newspaper Group (sold off in 2012), the Times’ remaining flagship and New England properties would have produced about $1.7 billion in advertising back then. That means ad revenues are 52 percent of what they were five years ago, a number highly consistent with overall newspaper ad loss in the U.S. Since hitting a top of almost $50 billion in ad revenues in 2005, industry ad revenues (print and digital) will come in around $22 billion for 2012. Yes, that’s a $28 billion difference.

New York Times Co. revenues 2007-2012 (in thousands)

Year	Ad revenue	Circulation revenue	Change in total revenue
2012	$883,221	$936,264	+0.3%
2011	954,531	862,982	-2.0%
2010	1,171,200	931,493	-2.7%
2009	1,336,291	936,486	-17%
2008	1,771,033	910,154	-7.7%
2007	2,047,468	889,882	-3.7%

* Data from 2007-2010 includes New York Times Regional Newspaper Group, which was sold in 2012. That revenue amounted to about 10-15 percent of the Times' News Media Group's revenues in most years. The 2009 decline is recession-worsened.

Circulation revenues, on the other hand, have been comparatively steady. While the number of copies sold has declined, increased pricing has kept that revenue relatively consistent. In fact, with its just-released 2012 report, the company’s circulation revenue is at an all-time high, considering that the 2009 number included those regional dailies. It’s not hard to imagine it topping the $1 billion mark. In fact, if the Times can repeat its 2012 growth rate in 2013, it will reach that pinnacle for the first time.
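The $1 billion claim is easy to check against the table. The sketch below is only a back-of-the-envelope illustration: it takes the 2011 and 2012 circulation figures as printed, ignores any adjustment for the 53rd accounting week, and assumes (as the paragraph does) that 2013 simply repeats the 2012 growth rate. The function and variable names are illustrative.

```python
# Circulation revenue in thousands of dollars, copied from the table above.
CIRC_2011 = 862_982
CIRC_2012 = 936_264


def growth_rate(previous: float, current: float) -> float:
    """Year-over-year growth as a fraction (0.085 means 8.5 percent)."""
    return (current - previous) / previous


rate_2012 = growth_rate(CIRC_2011, CIRC_2012)

# Assumption: 2013 repeats the 2012 growth rate exactly.
projected_2013 = CIRC_2012 * (1 + rate_2012)

print(f"2012 circulation growth: {rate_2012:.1%}")
# Table values are in thousands, so dividing by 1,000,000 yields billions.
print(f"Projected 2013 circulation revenue: ${projected_2013 / 1_000_000:.2f} billion")
```

Under those assumptions the projection lands a little above $1 billion, which is the pinnacle the paragraph describes.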

Zero isn’t boring: It took the Times a hell of a lot of work to get to zero. All that activity is a reflection of the worldwide newspaper model in transformation. It’s a struggle the Times shares with many other newspaper companies: Zero turns out to be an important number, a plateau on a long, downward journey, a stopping point on a mad path to finding growth somehow, somewhere.

Most others aren’t there yet; they still aspire to zero. McClatchy just reported its revenues were down 4.9 percent for 2012; A.H. Belo was down 5 percent. Gannett, the largest U.S. newspaper company and second largest globally, was down more than 2 percent for the year. Of the public companies, News Corp.’s publishing activities shared a flattish resemblance to the Times, though the imprecision of its reporting provides only hazy confidence in that number. Only a few companies report, privately, that their circulation revenue gains are now exceeding advertising losses. Significantly, all of them have put good metered paywalls up and have adopted best practices to maximize reader revenue.

It’s important to differentiate between a revenue percentage and a profit percentage. As revenues of these companies have declined, they’ve kept cutting expenses. Consequently, either their meager profits haven’t dropped, or their small annual losses haven’t worsened. Cutting is only a short-term way to get or stay in the black; top-line revenue growth is essential if newspaper-based companies are to re-find their mojo.

NYT Co.’s zero is driven by its two-year-old digital circulation strategy, which has allowed a nice plus sign for circulation to balance a corresponding minus for advertising. For 2012, Times Co. circulation revenue was up 8.5 percent. Advertising revenue was down 7.5 percent. The company’s major focus on all-access and digital circulation is behind that 8.5 percent number.

Figure that the increase — in dollars, $73 million — is driven by two big plays. First, the Times’ increased pricing on its seven-day and Sunday print/digital products — now more than $800 at full price outside New York City. Second, 640,000 digital-only customers at the Times and 28,000 at the Globe.
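The plus and minus signs described above can be reproduced from the 2011 and 2012 rows of the revenue table. This is a rough sketch, not the company's own accounting: it uses the table figures as printed (in thousands of dollars), and the helper name `pct_change` is just an illustrative choice.

```python
# Revenue in thousands of dollars, copied from the table above.
AD = {2011: 954_531, 2012: 883_221}
CIRC = {2011: 862_982, 2012: 936_264}


def pct_change(series: dict, start: int, end: int) -> float:
    """Percentage change between two years of a revenue series."""
    return (series[end] - series[start]) / series[start] * 100


circ_change = pct_change(CIRC, 2011, 2012)
ad_change = pct_change(AD, 2011, 2012)

# Convert from thousands of dollars to dollars.
circ_gain_dollars = (CIRC[2012] - CIRC[2011]) * 1_000
ad_loss_dollars = (AD[2011] - AD[2012]) * 1_000

print(f"Circulation: {circ_change:+.1f}% (${circ_gain_dollars / 1e6:.1f} million gained)")
print(f"Advertising: {ad_change:+.1f}% (${ad_loss_dollars / 1e6:.1f} million lost)")
print(f"Net of the two: ${(circ_gain_dollars - ad_loss_dollars) / 1e6:+.1f} million")
```

The circulation gain and the advertising loss nearly cancel, which is why the two lines together round to zero.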

On the ad side, the company has had to work very hard to bring in 92.5 percent of the ad revenues it brought in in 2011. On its year-end call for analysts, “headwinds” were the favorite metaphor for the ad decline. CEO Mark Thompson, in his debut performance as the Times’ emcee, correctly named such “structural” headwinds as programmatic buying, ad exchanges, overall downward pricing pressure, and the faster movement of national advertising — about 75 percent of the Times’ business — to lower-cost digital from higher-priced print.

He also cited “cyclical” headwinds, which usually means the damage that down economies do to ad spending. That explanation no longer holds, as the U.S.’ mildly up economy is helping generate an overall ad spending increase in the neighborhood of 3 percent. In its intense fight to hold on to as much of that lucrative print advertising as possible and to forge forward as a premium buy in digital, the Times managed to lose only 7.5 percent.

Overall, the zero plateau provides at least the illusion of a resting point. A point from which to figure out how to find growth, or at least how not to go negative again. That’s the company Mark Thompson has inherited; his job: find life above zero.

So let’s drive forward three years, keeping in mind the Times’ current revenue split, and see where the Times is likely to be. Those splits: 48 percent circulation, 45 percent advertising, 7 percent other.

Circulation revenue

Circulation, better now called reader revenue, is the star. The Times itself (subtracting the Globe’s 28,000) grew its overall number of digital-only subs to 640,000, a 57 percent increase over its end-of-2011 total of 406,000. That’s a torrid pace. It’s also a pace it will find difficult to keep up. Already, the Times is at about 2 percent of its domestic unique visitors, a number that includes lots of duplication given multiple digital devices per person. Early on, I pegged the natural core audience willing to pay one monthly price for the Times at about 3 percent of those uniques. I think the Times could climb to somewhere around 900,000 to 1 million digital-only subs. That would be a huge number and another milestone, a number that would significantly exceed its paid print circulation.
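The subscriber arithmetic above can be sketched in a few lines. Treat it as an approximation only: the 2 percent and 3 percent shares are the article's rounded estimates, and the unique-visitor count is not stated anywhere in the text, so the sketch backs it out from those shares. All variable names are illustrative.

```python
SUBS_END_2011 = 406_000   # digital-only subscribers at the end of 2011
SUBS_END_2012 = 640_000   # digital-only subscribers at the end of 2012
CURRENT_SHARE = 0.02      # "about 2 percent" of domestic unique visitors
CORE_SHARE = 0.03         # the estimated natural core, "about 3 percent"

# Growth in digital-only subscribers over 2012.
sub_growth = (SUBS_END_2012 - SUBS_END_2011) / SUBS_END_2011

# Assumption: back out the unique-visitor base from the rounded 2 percent share.
implied_uniques = SUBS_END_2012 / CURRENT_SHARE

# Subscribers if the Times reached the estimated 3 percent core.
core_subscribers = implied_uniques * CORE_SHARE

print(f"Subscriber growth in 2012: {sub_growth:.1%}")
print(f"Implied domestic uniques: {implied_uniques:,.0f}")
print(f"Subscribers at a 3 percent share: {core_subscribers:,.0f}")
```

With these inputs the 3 percent core works out to about 960,000 subscribers, inside the 900,000 to 1 million range suggested above.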

But even with its industry-leading success (in a league only with The Wall Street Journal and Financial Times), the law of big numbers makes the next few years tougher. That’s why we see the closing of purposely-left-open leaks in the paywall and an expanded, better promoted corporate digital circulation program. That’s why “global” is often the first or second word (alongside “video”) coming out of Mark Thompson’s mouth. About 10 percent of the Times’ digital subs come from outside the United States. With 95 percent of the potential market outside our borders, global digital circulation is a priority — but one likely to grow at an agonizingly slow pace, despite the Times’ moves (“The newsonomics of the Times’ global digital strategy”) into China and Brazil.

In last week’s analyst call, Denise Warren, who heads both advertising and the digital business, talked about premium digital products and dropped hints about a coming Times product aimed at youth.

My sense is that the next big digital frontier is, indeed, new products. A one-size-fits-all newspaper (whether print or digital) is Phase 1. Phase 2 means a set of digital products limited only by the imagination (“The newsonomics of 100 products a year”). These are topically, generationally, and geographically appropriate products. Politico, with its Pro line (“Politico Pro, one year in”), is one model. Considering the Times’ resources, imagine what it could do with Pro-type products. Paywalls, backed by the Times’ own technologies or offerings from the likes of Press+, TinyPass, Piano Media, and Cleeng, enable all sorts of one-time buying and sampling. In this game, a handful of the big magazine publishers are ahead in testing, and selling, one-off products.

Without these kinds of new revenue products, expect circulation revenue growth to slow, perhaps dramatically. One reason is the legacy side of the business. While the Times has impressively priced up print subs, it’s losing about 6 percent of subscribers a year. In the short term, the money is good enough to make those economics work. Over a three-year period, it’s trouble. A potential loss of another 20 percent of print subscribers would result in both reduced high-dollar subscription revenue and a markedly reduced rate base for print advertising. Over time, it will get harder and harder to make up lost revenue by charging a smaller and smaller set of print subscribers more money.

Perhaps the Times can charge its digital-only subscribers lots more. Its top price for digital access (web plus smartphone plus tablet) is $455 a year. That’s about 57 percent of its top print price, and of course smartphone-only or tablet-only users pay less. If you want complete digital access from The Wall Street Journal, at full rate, you’ll be paying about 83 percent of the Journal’s six-day print price. Will that gap close? It’s an open question how much digital subscribers to the 2016 New York Times will be asked to pay.

Given most of these trends, we can figure that the Times’ circulation revenue three years from now might be 15-18 percent higher than it is today, but it will require significant further innovation in new products to reach that number.

Ad revenue

Now let’s take circulation’s evil twin: advertising. Take out 2012’s 53rd week, and the fourth quarter was down 10.2 percent in print. More troubling: Digital ad revenue was down 1.7 percent for the quarter. The Times has said it expects the first quarter, which we are midway through, to be similar to the last one.

If ad revenues were down 7.5 percent in 2012, recent history suggests a similar decline this year. While it’s possible the slope could moderate as the base dwindles, there is no end in sight for print ad loss. The issue for the Times is that national ad revenue is moving to digital more rapidly than local, and the Times is getting a smaller and smaller share of fast-growing digital ad spending.

Put it together, and we can expect ad revenues to be down about 25 to 28 percent from current levels in three years.

Put ad and circulation revenue together, and it looks like a widening gap in the wrong direction — the loss in ad revenue is 10 points larger than the gain in circ revenue. There’s the core of the core problem. Zero, in that scenario, would be a high point. That’s the world Mark Thompson faces.
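To see what that gap might mean in dollars, the sketch below applies the forecast ranges (circulation up 15 to 18 percent, advertising down 25 to 28 percent) to the 2012 figures from the revenue table. Applying the ranges to those table values is an assumption made here for illustration, and "other" revenue is left out entirely, so the result is an order-of-magnitude check rather than a forecast.

```python
# 2012 revenue in thousands of dollars, copied from the revenue table.
AD_2012 = 883_221
CIRC_2012 = 936_264


def combined_change(circ_gain: float, ad_loss: float) -> float:
    """Change in ad + circulation revenue (in thousands) after three years.

    circ_gain and ad_loss are fractions, e.g. 0.15 for 15 percent.
    """
    future_circ = CIRC_2012 * (1 + circ_gain)
    future_ad = AD_2012 * (1 - ad_loss)
    return (future_circ + future_ad) - (CIRC_2012 + AD_2012)


# Most favorable end of both ranges: biggest circ gain, smallest ad loss.
best_case = combined_change(circ_gain=0.18, ad_loss=0.25)
# Least favorable end: smallest circ gain, biggest ad loss.
worst_case = combined_change(circ_gain=0.15, ad_loss=0.28)

print(f"Best case:  ${best_case / 1_000:+.1f} million")
print(f"Worst case: ${worst_case / 1_000:+.1f} million")
```

Even at the favorable end of both ranges, combined ad and circulation revenue comes out roughly $52 million lower than in 2012, and at the unfavorable end about $107 million lower, which is the sense in which zero would be a high point.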

In a world of declining revenues, cost-cutting remains one important option. While the Times newsroom has been protected far more than its metro newspaper peers, the Times has been cutting costs for half a decade, including its recent elimination of 30 more editorial positions. The Times plans low single-digit expense cuts again in 2013, but that pace won’t be enough to maintain a small profit by 2016, if ad and circ forecasts are right. In fact, the Times would need to transform its cost basis through heavy use of technology or risk deep cutting into its world-class core asset — its 1,000-person-plus newsroom — to keep itself in the black.

The most hopeful projection would be that the Times finds new substantial sources of revenue. Maybe it re-cracks the code and finds significant new digital ad money. A set of new digital products — some reader-paid, maybe some sponsored — is one possibility. Its events business — currently growing, but not big enough to be a game-changer — is another. The Times is not making a major move into marketing services, as many regional newspaper companies are, so that promising revenue stream won’t help the company. It’s not planning much in the way of acquisitions, says Thompson, even as its cash position outdistances its debt for the first time in a long time. We’ll have to see what the new CEO has up his London-starched sleeve.

This zero isn’t an alternative product, like Coke Zero. It’s a core number in the rebuilding of publishing businesses. The Times and its move — up or down — from zero will be much watched by news publishers worldwide.

Collage made up of photos of zeros by chrisinplymouth, used under a Creative Commons license.

YULI AKHMADA

Friday, 15 February 2013