Tech Talk: "Fair AI" mit Annalisa Cadonna von Crayon

Tech Talk: "Fair AI" mit Annalisa Cadonna von Crayon

Hi everyone, my name is Annalisa Cadonna. I’m a Senior Data Scientist at Crayon in the Center of Excellence for Data and AI. Today I’m talking to you about Fair AI and about our responsibility as data scientists.

Just a couple of words about me: I have a background in biomedical engineering, mathematics, and statistics, and I’ve done research in data science applied to different fields for a long time. A year ago I left academia and moved to Crayon, where I work as a Senior Data Scientist in the Data Insights group, and we develop end-to-end AI solutions. One of my passions, and one of the things I do at Crayon, is trying to develop fair AI solutions.

We all know that AI is experiencing really big growth. This has been enabled by different factors. First of all, we have large volumes of data. Second, cloud technologies allow even small and medium-sized companies to have access to big computational resources. And third, advancements in AI – think, for example, about advancements in natural language processing and in computer vision. So AI can unlock capabilities that until now were not thought possible. It is present in everyday life – think Alexa or Siri, or your mobile banking app giving you suggestions – but it is also one of the top priorities on the corporate agenda and on the public sector agenda.

So, what are some of the challenges in adopting AI, and what are the risks of not using AI in an informed way? I would like to start with a couple of examples.

The first example you may be familiar with: the Austrian public employment service, AMS. They had a problem: they needed to allocate resources for job seekers, and they decided to use AI to do so. They developed a logistic regression model, applied it to historical data, and came up with a probability for each job seeker of being successfully employed. They then divided job seekers into three groups based on these probabilities, with the assumption that people with a high probability of being employed didn’t need help, and that people with a low probability shouldn’t get help, because the resources would be – let’s say – wasted. So they decided to allocate the majority of the resources to the people in the middle group. What happened? The algorithm was very biased. Why? Of course, the model was trained on historical data, and some groups – because of societal bias – have historically been less successful in finding employment. The algorithm picked up this bias, amplified it, and said that women, non-EU citizens, or older people were not likely to get a job, even if they were the same in all other characteristics and features.

The second example is something that happened during the COVID-19 pandemic. As we all know, the pandemic brought many challenges; one of them was that students couldn’t go to school in person and could not take exams. The grade in the final high-school exam in the UK is very important and helps with admission to universities. Someone thought that teachers might be biased, so they decided to use an algorithm to predict the final grade. The algorithm used – of course – the students’ past performance, but also data about the school the pupils were attending. In the end, pupils attending public schools, perhaps in disadvantaged districts, were assigned a grade lower than what their teachers had predicted, while pupils from good private schools got a grade higher than what their teachers had predicted. Luckily, in the UK case, this algorithm was not used in the end. Both of these are examples of AI failures.
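To make the mechanics of the first example concrete, here is a minimal, purely illustrative sketch in Python of the kind of pipeline described: a logistic regression trained on historical data produces employment probabilities, which are then cut into three groups. The features, thresholds, and column names are hypothetical assumptions for the example, not the actual AMS model – and any societal bias in the historical labels would of course be reproduced by such a model.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical historical data: features plus a "was successfully employed" label.
df = pd.DataFrame({
    "years_experience": [1, 7, 3, 10, 0, 5, 2, 8],
    "age":              [22, 45, 30, 52, 19, 38, 26, 41],
    "employed":         [0, 1, 0, 1, 0, 1, 1, 1],
})

X, y = df[["years_experience", "age"]], df["employed"]
model = LogisticRegression().fit(X, y)

# Predicted probability of successful employment for each job seeker.
p = model.predict_proba(X)[:, 1]

# Split into low / middle / high groups using hypothetical thresholds;
# in the AMS case, most support went to the middle group.
df["group"] = pd.cut(p, bins=[0.0, 0.3, 0.7, 1.0], labels=["low", "middle", "high"])
print(df[["employed", "group"]])
```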

What are some of the challenges when we implement an AI solution? It is not only a technical challenge. The first challenge is societal: one needs to understand what impact the AI solution will have on society, on the users, and on everyone involved. The second – of course – is technical. We need to check: what does our data look like? Is it biased or not? Is our model amplifying the bias in the data? And then we need to make sure that we create fair models. This is really tricky, because when you increase fairness, you usually decrease the performance of the model. Third – as with everything in data science – it’s important to monitor. So, while we monitor performance metrics and business metrics, it’s also important to monitor fairness metrics. The last challenge is legal. Many times, especially when the outcome of the AI solution impacts people, there are regional and global laws and regulations that we need to make sure our AI system respects.
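As a small illustration of monitoring fairness alongside performance, here is a hedged sketch that recomputes a selection-rate gap for each monthly batch of predictions and flags it when it exceeds a chosen threshold. The prediction log, the column names, and the 0.10 threshold are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical prediction log: month, sensitive group, and the model's decision.
log = pd.DataFrame({
    "month":    ["2021-01"] * 4 + ["2021-02"] * 4,
    "group":    ["a", "a", "b", "b", "a", "a", "b", "b"],
    "approved": [1,   0,   1,   0,   1,   1,   0,   0],
})

THRESHOLD = 0.10  # maximum acceptable gap in approval rates between groups

for month, batch in log.groupby("month"):
    rates = batch.groupby("group")["approved"].mean()  # approval rate per group
    gap = rates.max() - rates.min()
    status = "ALERT" if gap > THRESHOLD else "ok"
    print(f"{month}: approval-rate gap = {gap:.2f} ({status})")
```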

So, how do we ensure fairness at Crayon? First of all, fair AI is a new field, and there is a lot of research; much of it is theoretical, and there is a gap between theory and practice. We are collaborating with universities and other companies in a project called "fAIr by design", sponsored by the FFG – the Austrian Research Promotion Agency – and we are trying to use state-of-the-art methods to develop AI solutions that are fair by design: not only during AI development, but also before and after.

We usually follow four main steps. The first step – it seems obvious, but as the examples we saw before show, it is not – is to identify the potential for discrimination. We need to continuously communicate with the stakeholders, with the business, with everyone involved, and identify whether there is a potential for discrimination and where this potential lies. Second, we check the data and we check the fairness of our models. If they are fair, we are happy; if they are not, we need to find ways to ensure model fairness. This is usually done at one of three stages: through pre-processing of the data, through in-processing in the model – for example, parameter tuning – or through post-processing of the predictions. Since the data could change over time, or the model might no longer be right for future data, we also need to continuously monitor the fairness metrics.
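As an illustration of the pre-processing stage, here is a minimal sketch in the spirit of reweighing: each sample is weighted so that group membership and outcome look statistically independent in the training data, and the weights are then passed to the model. The toy data, the column names (`group`, `label`), and the use of scikit-learn’s `sample_weight` are assumptions for the example, not part of the project described in the talk.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def reweighing_weights(df, group_col, label_col):
    """Per-sample weights that make group membership and label look independent,
    in the spirit of the reweighing pre-processing technique."""
    n = len(df)
    p_group = df[group_col].value_counts() / n                 # P(group)
    p_label = df[label_col].value_counts() / n                 # P(label)
    p_joint = df.groupby([group_col, label_col]).size() / n    # P(group, label)

    def weight(row):
        g, y = row[group_col], row[label_col]
        return (p_group[g] * p_label[y]) / p_joint[(g, y)]

    return df.apply(weight, axis=1)

# Hypothetical training data with one feature, a sensitive attribute, and a label.
df = pd.DataFrame({
    "feature": [0.2, 0.5, 0.9, 0.4, 0.7, 0.1, 0.8, 0.3],
    "group":   ["a", "a", "a", "a", "b", "b", "b", "b"],
    "label":   [1,   1,   1,   0,   0,   0,   1,   0],
})

weights = reweighing_weights(df, "group", "label")
model = LogisticRegression().fit(df[["feature"]], df["label"], sample_weight=weights)
```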

How do we monitor or check fairness? Since we are data scientists, we like metrics – so together with the usual performance metrics, like AUC, precision, recall and so on, and business metrics – usually in terms of revenue or savings for a business – it’s also important to consider fairness metrics. There are two families of fairness metrics. The first is demographic parity. The idea of demographic parity – think of a classification model with a positive and a negative outcome, for example being hired and not being hired, and two groups – is that both groups should have the same proportion of people being hired, and our model should predict the same proportion for both. This does not consider the performance or the error of the model in the two groups. Then there is a big class of metrics that I call here, for simplicity, performance balance, which says that the error in one group and in the other group should not be very different. This is very important, for example, in computer vision.
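To make these two families of metrics concrete, here is a small sketch that computes a demographic-parity gap and an error-rate (performance-balance) gap by hand with NumPy. The toy labels, predictions, and group assignments are purely illustrative.

```python
import numpy as np

# Toy binary ground truth and predictions for two groups (illustrative only).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 0])
group  = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])

def selection_rate(y_hat, mask):
    """Share of positive predictions ("hired") within one group."""
    return y_hat[mask].mean()

def error_rate(y, y_hat, mask):
    """Share of wrong predictions within one group."""
    return (y[mask] != y_hat[mask]).mean()

mask_a, mask_b = group == "a", group == "b"

# Demographic parity: both groups should receive positive predictions at similar rates.
dp_gap = abs(selection_rate(y_pred, mask_a) - selection_rate(y_pred, mask_b))

# Performance balance: the model should not make many more errors in one group.
err_gap = abs(error_rate(y_true, y_pred, mask_a) - error_rate(y_true, y_pred, mask_b))

print(f"demographic parity gap: {dp_gap:.2f}, error-rate gap: {err_gap:.2f}")
```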

There are a few packages in the literature, but the majority of the packages and libraries only do auditing: you take a model, check the fairness metrics, and see whether your model is fair or not. The two main packages for actually mitigating bias are AIF360 from IBM and Fairlearn, which was created by Microsoft; these are also the two packages with a very active community.
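As a hedged sketch of what bias mitigation can look like in practice – assuming Fairlearn is installed and using its ThresholdOptimizer post-processing step on top of an already trained scikit-learn classifier; the synthetic data and parameter choices are illustrative only:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import demographic_parity_difference

rng = np.random.default_rng(0)

# Synthetic data: one feature, a binary sensitive attribute, and a biased label.
n = 1000
sensitive = rng.integers(0, 2, size=n)
x = rng.normal(size=n) + 0.5 * sensitive
y = (x + 0.8 * sensitive + rng.normal(scale=0.5, size=n) > 0.5).astype(int)
X = x.reshape(-1, 1)

# Plain model, then a post-processing step that equalises selection rates.
base = LogisticRegression().fit(X, y)
mitigator = ThresholdOptimizer(
    estimator=base,
    constraints="demographic_parity",
    prefit=True,
)
mitigator.fit(X, y, sensitive_features=sensitive)

y_base = base.predict(X)
y_fair = mitigator.predict(X, sensitive_features=sensitive)

print("before:", demographic_parity_difference(y, y_base, sensitive_features=sensitive))
print("after: ", demographic_parity_difference(y, y_fair, sensitive_features=sensitive))
```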

Now, we talked about the challenges, but building fair AI solutions also brings a lot of benefits. First of all, it increases trust – not only the trust of customers towards businesses, but also the trust of the population in AI in general – and it can help the adoption of AI. Second, it can have a positive social impact. Last but not least, if companies already start thinking about the fairness of their solutions, they will be prepared for upcoming legal requirements. One big example is the proposal for an AI regulation that was released in April by the European Commission.

So, what are our responsibilities as data scientists? First of all, a lot of the time when we talk about data science, the focus is on the technology and on the model, not on the people who are doing data science. We as data scientists are always – or often – the ones who are called to make judgement calls on the use and applications of data and AI. So we cannot stay silent: every time we work on an application, we should bring this topic up and ask questions regarding the potential for discrimination! As in other aspects of data science, we should explain and educate businesses – for example, to have a data-driven mentality and to consider not only business metrics but also fairness metrics. And we should keep ourselves up to date with upcoming regulations and with the community around fairness in AI.

And I would like to finish with the motto of Crayon: "We believe in the power of technology to drive the greater good."

Thank you! These are my contacts, please feel free to reach out!