Increasingly, ethics and data science are topics in the same conversation—around the water cooler, in the board room, in the university classroom, around the dinner table, and on a leisurely evening walk through the neighborhood.
According to some in the data analytics field, it’s about time. They’ve been beating the data science ethics drum for years. Data scientist Mark Madsen is one of them. A long-time programmer and data cruncher, he began fielding requests from colleges and universities about their data science curricula. There was broad agreement that courses such as mathematics, statistics, and computing were crucial. But when he mentioned including subjects like the history of science, philosophy, or communication, the conversations often trailed off.
Madsen expressed his frustration in a December 2018 tweet:
“A half dozen universities asked me for help designing a data science curriculum at the time it was heating up. When I would suggest adding subjects like critical thinking, philosophy, history of science, communication, visual design, or anthropology, they would stop talking to me.”
But the tide may be turning.
Following recent data breaches and privacy scandals, universities including Cornell, Harvard, and Stanford have incorporated ethics courses into their degree programs.
“Stanford absolutely has a responsibility to play a leadership role in integrating these perspectives, but so does Carnegie Mellon and Caltech and Berkeley and M.I.T.,” says Jeremy Weinstein, a Stanford political science professor and co-developer of their ethics course. “The set of institutions that are generating the next generation of leaders in the technology sector have all got to get on this train.”
But why do we need to have the “ethics conversation”?
We are undeniably in the midst of a data revolution, to the tune of 2.5 quintillion bytes of data created every day. Not only can individuals and organizations store and analyze this data in massive quantities, that data exerts monumental influence, every day, on decisions and thought processes, product innovations, and medical discoveries, to name but a few of the many impacted sectors. And we’re talking life-altering, society-defining influence.
“Data-driven technologies also challenge the fundamental assumptions upon which our societies are built,” notes Margo Boenig-Liptsin, co-instructor of UC Berkeley’s “Human Contexts and Ethics of Data” course. “In this time of rapid social and technological change, concepts like ‘privacy,’ ‘fairness,’ and ‘representation’ are reconstituted.”
As new analytical tools and methods emerge, the potential benefits can barely be imagined. These are incredibly exciting times, yet times that invite, indeed demand, a sense of caution, along with serious, answer-seeking conversations about how we put safeguards in place.
The mindset of the naysayers
Kalev Leetaru, who studies and writes about the broad intersection of data and society, draws these conclusions from his experiences with newly degreed graduates heading off to the commercial world: “I hear very clearly the impact of academia’s disdain for big data ethics.” Leetaru calls it “a sad commentary on the academic world that trains the data scientists and programmers that are shifting the online world away from privacy.”
A graduating doctoral student from a top university shared openly and excitedly with Leetaru that her views on mining and manipulating customer data aligned with those of the major Silicon Valley company that would be her employer. She described how “IRBs were merely obstacles to be worked around or ignored.” She argued that users of social media platforms sign legal contracts granting those platforms the right to do whatever they please with personal data, regardless of whether the users understand the terms. She then bragged about “how her university research group had been mass harvesting social media data and sharing it widely,” dismissing terms-of-service restrictions as merely optional recommendations.
However, when asked whether she would accept the same harvesting of her own personal information from sites she had willingly registered on, her answer was a resounding no. Her only articulated rationale was a different set of standards for the “societal elite” (those who create and study platforms) and “ordinary people” (everyone else).
“Such an empathy gap is common in the technical world, in which people’s lives are dehumanized into spreadsheets of numbers that remove any trace of connection or empathy,” suggests Leetaru.
Does data science need a “Hippocratic Oath”?
One idea gaining traction is a data science “Hippocratic Oath.” Like the “do no harm” pledge taken by medical professionals, it would commit those working with data to a pledge, manifesto, set of principles, or code of conduct.
At Bloomberg’s Data for Good Exchange (D4GX) in New York City in September 2017, the company announced a partnership with Data for Democracy and BrightHive to bring the data science community together to explore this topic. There, volunteers from universities, nonprofits, local and federal government agencies, and tech companies drafted a set of guiding principles that could be adopted as a code of ethics. The group reconvened in February 2018 at the San Francisco D4GX event.
These efforts and numerous others demonstrate a growing movement interested in the ethical aspects of technology. As Lucy Erickson put it, “Successful efforts will require thoughtful and sustainable collaboration to apply insights and refine solutions, particularly related to advances in data science and artificial intelligence (AI) systems.”
In addition to our placement, employee recruiting, and contract/temp staffing services, RomAnalytics offers gig/project-based consulting assistance. We can design screeners and questionnaires, moderate sessions, and hone in on the key insights in the data to provide exceptional client-ready reports. Check out our complete line of staffing services for the market insights, data analytics, and data engineering industries.