In this episode, Kyle reviews what we’ve learned so far in our series on Fake News and talks briefly about where we’re going next.
About the “Data Skeptic” Podcast
The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
Two weeks ago we discussed click through rates, or CTRs, and their usefulness and limits as a metric. Today, we discuss a related metric known as quality score. While that phrase has probably been used to mean dozens of different things in different contexts, our discussion focuses on the idea of quality score encountered in Search Engine Marketing (SEM). SEM is the practice of purchasing keyword-targeted ads shown to customers using a search engine. Most SEM is managed via an auction mechanism: the advertiser states the price they are willing to pay, and in real time, the search engine serves advertisements to users and charges the advertiser. But how do search engines decide which ads to show and what price to charge? This is a complicated question that requires a multi-part answer to address completely. In this episode, we focus on one part of that equation: the quality score the search engine assigns to the ad in context. This quality score is calculated from several factors, including crawling the destination page (also called the landing page) and predicting how applicable the content found there is to the ad itself.
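To make the role of a quality score in the auction concrete, here is a minimal sketch in Python. It assumes a commonly described simplification in which ads are ordered by bid multiplied by quality score; that rule, the `rank_ads` function, and the sample bids and scores are illustrative assumptions rather than anything stated in the episode, and real search engines use more factors. Pricing is left out entirely.

```python
from typing import Dict, List


def rank_ads(ads: List[Dict]) -> List[Dict]:
    """Order ads from best to worst placement.

    Assumption (not from the episode): placement is decided by
    bid * quality_score, a widely used simplification.
    """
    return sorted(
        ads,
        key=lambda ad: ad["bid"] * ad["quality_score"],
        reverse=True,
    )


# Hypothetical advertisers: bids in dollars, quality scores on a 1-10 scale.
ads = [
    {"name": "A", "bid": 4.00, "quality_score": 2},
    {"name": "B", "bid": 3.00, "quality_score": 5},
    {"name": "C", "bid": 1.00, "quality_score": 9},
]

ranked_names = [ad["name"] for ad in rank_ads(ads)]
print(ranked_names)
```

Under this assumed rule the highest bidder, A, ends up in last place because its ad scores poorly on quality, which is the kind of effect a quality score is meant to have.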
Kyle interviews Steven Sloman, Professor in the school of Cognitive, Linguistic, and Psychological Sciences at Brown University. Steven is co-author of The Knowledge Illusion: Why We Never Think Alone and Causal Models: How People Think about the World and Its Alternatives. Steven shares his perspective and research into how people process information and what this teaches us about the existence of and belief in fake news.
A Click Through Rate (CTR) is the ratio of clicks to impressions for some item of content shared online. This terminology is most commonly used in digital advertising but applies just as well to content that websites might choose to feature on their homepage or in search results. A CTR is intuitively appealing as a metric for optimization. After all, if users are uninterested in some content, under normal circumstances, it’s reasonable to assume they would ignore the content rather than click on it. On the other hand, the best content is likely to elicit a high CTR as users signal their interest by following the hyperlink. In the advertising world, a website could charge per impression, per click, or per action. Both impression-based and action-based pricing have asymmetrical results for the publisher and advertiser. However, paying per click (CPC-based advertising) seems to strike a nice balance. For this and other numeric reasons, many digital advertising mechanisms (such as Google AdWords) use CPC as the payment mechanism. When charging per click, an advertising platform will value a high CTR when selecting which ad to show. As we learned in our episode on Goodhart’s Law, once a measure is turned into […]
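The arithmetic behind that description can be sketched in a few lines of Python. The function names and the sample numbers are hypothetical, and treating CPC multiplied by CTR as the platform's expected revenue per impression is a simplifying assumption, not a description of any particular ad platform.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return clicks divided by impressions (0.0 if nothing was shown)."""
    if clicks < 0 or impressions < 0:
        raise ValueError("clicks and impressions must be non-negative")
    if impressions == 0:
        return 0.0
    return clicks / impressions


def expected_revenue_per_impression(cpc: float, ctr: float) -> float:
    """Assumed model: the platform earns the CPC only when a click happens."""
    return cpc * ctr


# Hypothetical ads: ad_a pays more per click, ad_b is clicked more often.
ads = {
    "ad_a": {"cpc": 2.00, "clicks": 10, "impressions": 1000},
    "ad_b": {"cpc": 0.50, "clicks": 50, "impressions": 1000},
}

revenue = {
    name: expected_revenue_per_impression(
        ad["cpc"], click_through_rate(ad["clicks"], ad["impressions"])
    )
    for name, ad in ads.items()
}

# The ad the platform would prefer under this assumed model.
best_ad = max(revenue, key=revenue.get)
print(best_ad, revenue)
```

With these made-up numbers, the cheaper ad_b is worth more per impression to the platform than ad_a, which illustrates why a platform that charges per click cares about CTR and not only about the bid.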
The scale and frequency with which information can be distributed on social media make the problem of fake news a rapidly metastasizing issue. Any content filtering or labeling at that scale demands an algorithmic solution. In today’s episode, Kyle interviews Kai Shu and Mike Tamir about their independent work exploring the use of machine learning to detect fake news. Kai Shu and his co-authors published Fake News Detection on Social Media: A Data Mining Perspective, a research paper which both surveys the existing literature and organizes the structure of the problem in a robust way. Mike Tamir led the development of fakerfact.org, a website and Chrome/Firefox plugin which leverages machine learning to try to predict the category of a previously unseen web page, with categories like opinion, wiki, and fake news.