Tweeting in New York City
Data science can teach us to sympathize
Cities are growing quickly. In fact, the United Nations estimates that by 2050, 66 percent of the world’s population will live in urban areas. At present, more than 500 cities have a population of more than 1 million people. These cities have an average of one Adventist congregation for every 89,000 people!
Reaching People’s Tweets
Christ was relentless in His passion to reach people living in the cities of His day. And His method alone will give us true success in our effort to reach the city dwellers of our day.1 Following Christ’s method means understanding and meeting people’s needs. And their tweets are saying what those needs are.
Data science and tweeting are now committed partners. Chris Moody, Twitter’s vice president for data strategy, describes Twitter as the largest public, searchable archive of human thought that’s ever existed.2 This grand archive contains a trove of information on feelings for those who know how to mine it. Researchers who investigate such vast collections of data are called data scientists. Their discipline is defined as the study of the generalizable extraction of knowledge from data.3
One area of data science research is sentiment analysis, the computational study of opinions, sentiments, and emotions expressed in text.4 Sentiment analysis has turned its focus to the Twitter world because tweeting can tell us a lot about “the vagaries of human emotion.”5 That fact has not been missed by researchers at the University of Montemorelos, a university in Mexico where scholarship is particularly focused on understanding and advancing Christian principles and practice, and where Christ’s soul-winning methods are kept in sharp focus. This article describes how data science has been used at the university’s Global Software Lab to understand the needs of people in New York City, a metropolis that has pivotal significance in our church’s ongoing Mission to the Cities project.
Listening Closely to the Birds
Users on Twitter publish short messages called “tweets,” a name borrowed from the birds; each tweet can be up to 140 characters long. This article reports on how sentiment analysis, carried out by means of machine learning, was used to discover individuals’ needs from their tweets, translating the vagaries of human emotion into hard data. Tweets are classified as positive when they communicate a positive sentiment, such as happiness; as negative when they communicate a negative sentiment, such as sadness; and as neutral when no emotions are implied.
By way of illustration, a tweet with a negative sentiment about rest may indicate that its author would like to take a break. Conversely, a conglomeration of positive tweets on vegetarianism in a particular area can indicate a trend toward satisfying the need for healthy food in that area.
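The three-way labeling described above can be sketched in a few lines of Python. This is only a toy illustration: it uses a hand-written word list rather than the machine-learning classifier the researchers employed, and the word lists, the function name classify_tweet, and the sample tweets are all invented for the example.

```python
import re

# Hypothetical, deliberately tiny word lists. A real study would learn
# such cues from labeled training data instead of hard-coding them.
POSITIVE_WORDS = {"happy", "love", "great", "delicious", "blessed", "thankful"}
NEGATIVE_WORDS = {"sad", "tired", "exhausted", "hate", "awful", "lonely"}


def classify_tweet(text: str) -> str:
    """Label one tweet as 'positive', 'negative', or 'neutral'."""
    words = re.findall(r"[a-z']+", text.lower())
    # Count positive cues minus negative cues; the sign decides the label.
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up sample tweets, one for each label.
samples = [
    "So tired, I really need some rest",
    "Love this vegetarian place, delicious food",
    "Taking the train to Brooklyn",
]
for tweet in samples:
    print(classify_tweet(tweet), "-", tweet)
```

A word-counting rule like this misses sarcasm, negation, and context, which is one reason researchers generally prefer classifiers trained on large sets of example tweets.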
Over a period of six weeks (September 22 to November 3, 2015), researchers collected 2,084 tweets from New York City, 1,633 of them bearing positive sentiments and 451 expressing negative sentiments.6 Tweets with neutral sentiments were not collected.
Since tweets can be about any topic, the scope was limited to tweets containing one or more of 30 specified keywords.7
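A keyword filter of this kind might look like the following sketch. The keyword list is the one given in note 7; the matching rule (case-insensitive, whole words only) and the function name matched_keywords are assumptions made for illustration, since the article does not say how the matching was done.

```python
import re

# The 30 keywords listed in note 7 of this article.
KEYWORDS = [
    "addiction", "Adventist", "Bible", "children", "Christ", "church",
    "contamination", "divorce", "education", "elderly", "exercise", "family",
    "God", "health", "Jesus", "obesity", "peace", "poverty", "religion",
    "rest", "safety", "salvation", "Savior", "stress", "teenagers", "teens",
    "terrorism", "vegetarian", "violence", "youth",
]

# Assumption: match whole words only, ignoring case, so that "rest"
# does not match "restaurant".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def matched_keywords(text: str) -> list[str]:
    """Return the distinct keywords (lowercased, sorted) found in a tweet."""
    return sorted({m.lower() for m in _PATTERN.findall(text)})


# Made-up sample tweets; only those with at least one keyword are kept.
tweets = [
    "Family dinner after church tonight",
    "Best restaurant in Midtown",
    "Need some rest after all this stress",
]
in_scope = [t for t in tweets if matched_keywords(t)]
print(in_scope)
```

Whole-word matching is a design choice: it avoids false hits inside longer words, at the cost of missing hashtags or misspellings that a looser match would catch.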
Maps were generated in CartoDB with the collected datasets in order to analyze the results. Figure 1 shows the intensity of tweets, both positive and negative, for the 30 keywords in New York City. Areas in red have a higher number of tweets. Manhattan, New York City’s most densely populated borough, served up more than half the total number of these tweets (1,115 out of 2,084, or 53.5 percent).
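The figures reported so far can be checked with simple arithmetic. The short sketch below uses only the counts stated in this article; the variable names are chosen for the example.

```python
# Counts reported in this article.
positive = 1633
negative = 451
manhattan = 1115

total = positive + negative  # all collected tweets
manhattan_share = 100 * manhattan / total
positive_share = 100 * positive / total
negative_share = 100 * negative / total

print(f"Total tweets: {total}")
print(f"Manhattan share: {manhattan_share:.1f} percent")
print(f"Positive: {positive_share:.1f} percent, negative: {negative_share:.1f} percent")
```

The totals agree with the text: the positive and negative counts add up to 2,084, and Manhattan's 1,115 tweets come to 53.5 percent of them.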
Figure 2 shows the concentration of tweets with negative sentiments in central and southern Manhattan, with red indicating the highest concentrations. Researchers, forward-thinking missionary workers, and city residents may all find it worthwhile to note the four areas with the highest concentration of negative tweets: Tribeca, Chinatown, the Diamond District, and Tudor City.
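The idea behind such an intensity map can be sketched by counting tweets per grid cell. This is not how CartoDB renders its maps internally; it is a simplified illustration, and the coordinates, the cell size, and the function name cell_for are all hypothetical.

```python
import math
from collections import Counter

CELL_SIZE = 0.01  # grid cell size in degrees (an arbitrary choice)


def cell_for(lat: float, lon: float) -> tuple[int, int]:
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / CELL_SIZE), math.floor(lon / CELL_SIZE))


# Made-up (latitude, longitude) points standing in for geotagged negative tweets.
negative_tweet_locations = [
    (40.7163, -74.0086),
    (40.7168, -74.0081),
    (40.7506, -73.9712),
]

# Count how many tweets fall into each cell; fuller cells would be drawn "hotter."
counts = Counter(cell_for(lat, lon) for lat, lon in negative_tweet_locations)
for cell, n in counts.most_common():
    print(cell, n)
```

Cells with the highest counts correspond to the red areas on an intensity map; a smaller cell size gives finer detail but needs more tweets to show a stable pattern.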
Upbeat and Downbeat
Manhattan’s occasional concentrations of negative tweets should not be taken to mean that the people of that borough are all downbeat. As Figure 3 shows, they are positive about particular topics, in this case vegetarian food. All Manhattan’s tweets about this topic were positive, which may speak not merely to a vegetarian interest, but possibly to the quality of food and service that meet that interest.
On the other hand, “family,” another keyword, which appeared in 28 percent of the total collected tweets, drew both positive and negative tweets in Manhattan, as seen in Figure 4. The highest concentration of negative tweets about family is in the southern part of the borough.
Toward a New Way of Reaching People
Data science has the potential to help us understand the needs of people in big cities in an unprecedented way. Where it points to areas with a high concentration of negative sentiments, we may establish centers of influence to pinpoint and ameliorate those specific concerns. While the data may not explain why patterns of optimism or negativity occur, discovering those patterns is key to being able to ask the right questions and address the issues that matter to our public. Institutionally and individually, conference programs and personal visits to neighbors can help investigate the sources of these sentiments with a view to mitigating or benefiting from them.
Our evangelistic efforts will become more intelligently focused as the Global Software Lab refines its studies to take ethnic and linguistic factors into account and develops tools that individual members may use to serve the gospel’s cause. Through their neighbors’ interest in good food, for example, members may be better able to persuade them of the delectability of the Bread of Life.
1. Ellen G. White, The Ministry of Healing (Mountain View, Calif.: Pacific Press Pub. Assn., 1905), p. 143.
2. T. Simonite, “Twitter Boasts of What It Can Do With Your Data,” MIT Technology Review, Oct. 21, 2015. www.technologyreview.com/news/542711/twitter-boasts-of-what-it-can-do-with-your-data/; retrieved Nov. 10, 2015.
3. V. Dhar, “Data Science and Prediction,” Communications of the ACM 56, no. 12 (2013): 64-73.
4. B. Liu, “Sentiment Analysis and Subjectivity,” in N. Indurkhya and F. J. Damerau, Handbook of Natural Language Processing, 2nd ed. (Boca Raton, Fla.: Chapman & Hall, 2010), pp. 627-665.
5. A. Go, R. Bhayani, and L. Huang, Twitter Sentiment Classification Using Distant Supervision (Stanford University, 2009); A. Wright, “Sentiment Analysis Takes the Pulse of the Internet,” New York Times, Aug. 23, 2009, p. B1.
6. This ratio is consistent with findings that these modern mini-messages tend to be mostly upbeat. See John Bohannon, “Positive News for Twitter Users: Your Emojis Are Mostly Upbeat,” Science, Dec. 9, 2015: http://news.sciencemag.org/brain-behavior/2015/12/positive-news-twitter-users-your-emojis-are-mostly-upbeat?utm_campaign=email-news-latest&et_rid=17040727&et_cid=143396.
7. List of keywords: addiction, Adventist, Bible, children, Christ, church, contamination, divorce, education, elderly, exercise, family, God, health, Jesus, obesity, peace, poverty, religion, rest, safety, salvation, Savior, stress, teenagers, teens, terrorism, vegetarian, violence, youth.
Germán H. Alférez is a researcher at the Global Software Lab, School of Engineering and Technology, Montemorelos University, Mexico.