Facebook Creates Open-Source Dataset to Lessen AI Bias

Facebook could use some good PR after the latest news that the social media company allowed the data of millions of users to be stolen. To right the ship a little, Facebook has created an open-source dataset that it believes will lessen AI bias.

Also read: Facebook Data Leaked from Over 500 Million Users

Facebook Aims to Fix AI Bias

AI bias is a long-standing problem in facial recognition. While artificial intelligence tries to identify people through their unique facial features, it has historically performed poorly on non-male, non-white individuals.

Facebook has set out to fix AI bias with an open-source dataset it calls “Casual Conversations.” It includes 45,186 videos of more than 3,000 people having unscripted conversations. The participants are of different genders, age groups, and skin tones.

Actors were paid to submit videos and provided their own age and gender, so that those labels would carry as little bias as possible. The Facebook team then labeled the videos by skin tone using the Fitzpatrick scale, which classifies skin into six types.

Lighting conditions were noted as well, to show how different skin tones appear in low-light situations. Both audio and visual AI can be tested with the Casual Conversations dataset. The purpose isn’t to develop algorithms but to evaluate how existing algorithms perform across different faces.

Two of the datasets currently used for facial recognition – IJB-A and Adience – are composed mostly of lighter-skinned people. IJB-A is 79.6 percent lighter-skinned, while Adience is 86.2 percent.
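Composition figures like these come from counting how often each annotation appears. The following is a minimal sketch of such a count, assuming a hypothetical list of per-image skin-tone labels; the `composition` helper and the toy labels are illustrative and are not taken from either dataset.

```python
from collections import Counter


def composition(labels):
    """Return each label's share of the dataset as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical toy annotations, not real data from any dataset.
skin_labels = ["lighter"] * 8 + ["darker"] * 2

shares = composition(skin_labels)
for label, pct in shares.items():
    print(f"{label}: {pct:.1f} percent")
```

A heavily lopsided result is a signal that accuracy measured on the dataset mostly reflects performance on the majority group.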

Beyond skin tone, an MIT study found that the classifiers from IBM, Microsoft, and Face++ performed better on male faces than on female faces. There were almost no mistakes with lighter-skinned male faces, while darker-skinned female faces had an error rate of nearly 35 percent.

Casual Conversations aims to help evaluate the algorithms currently in use. “Our new Casual Conversations dataset should be used as a supplementary tool for measuring the fairness of computer vision and audio models, in addition to accuracy tests, for communities represented in the dataset,” said Facebook’s team working on the project.
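One common way to measure fairness alongside overall accuracy is to break a model’s error rate down by subgroup and compare the best and worst groups. The sketch below assumes hypothetical records of the form (group, true label, predicted label); the function names and toy data are illustrative and are not Facebook’s evaluation code.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Map each group to its error rate, given (group, y_true, y_pred) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += 1
        if y_true != y_pred:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


def worst_gap(rates):
    """Difference between the highest and lowest per-group error rates."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy predictions, not results from any real model.
records = [
    ("lighter male", 1, 1), ("lighter male", 0, 0),
    ("lighter male", 1, 1), ("lighter male", 0, 0),
    ("darker female", 1, 1), ("darker female", 0, 1),
    ("darker female", 1, 1), ("darker female", 0, 0),
]

rates = error_rates_by_group(records)
print(rates)
print("gap:", worst_gap(rates))
```

A model can score well overall and still show a large gap here, which is why a per-group breakdown is useful in addition to a single accuracy number.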

Casual Conversations Evaluations

Facebook used Casual Conversations to test the five winning algorithms from the 2020 Deepfake Detection Challenge, a competition set up to spur the detection of doctored media being posted online.

Despite being well-respected algorithms, they struggled with darker skin tones. The third-place winner in the challenge actually fared the best with Casual Conversations.

Facebook has already released the dataset to the open-source community. In doing so, it noted that the dataset labels gender only as “male,” “female,” and “other,” and acknowledged that these labels can’t identify people who are nonbinary.

“Over the next year or so, we’ll explore pathways to expand this dataset to be even more inclusive, with representations that include a wider range of gender identities, ages, geographical locations, activities, and other characteristics,” said Facebook of its efforts to eliminate AI bias.

Read on to learn about Microsoft’s efforts to have facial recognition regulated to eliminate bias.
