Kathy Baxter is the Principal Architect for Ethical AI Practice here at Salesforce. Ethics in AI is a complex topic, but Kathy is truly an expert. She started researching it in 2016, when she first learned how toxicity and bias can end up creeping into AI.
Today, Kathy is joining us to talk all about Ethical AI. In our conversation, we also discuss how ethics can be built into technical products and AI. Join us to learn all about this interesting facet of our industry.
Show Highlights:
- How Kathy got into this line of work.
- What an AI-first company is.
- How toxicity can creep into chatbots.
- Why ethical AI is so important in a company like Salesforce.
- Unique things Salesforce uses AI for.
- AI considerations developers should keep in mind from the very beginning of building an application.
- Where AI can’t replace human intuition.
Links:
- Kathy on Twitter: https://twitter.com/baxterkb
- Kathy on LinkedIn: https://www.linkedin.com/in/kathykbaxter/
- Kathy on Mastodon: https://mastodon.social/@baxterkb
- Responsible Creation of AI Trailhead: https://trailhead.salesforce.com/content/learn/modules/responsible-creation-of-artificial-intelligence
- Building Ethical and Inclusive Products Trailmix: sfdc.co/EthicalandInclusiveProducts
- Salesforce Trusted AI Site & Resources: salesforceairesearch.com/trusted-ai
- AI Ethics Maturity Model: sfdc.co/EAIMM
Episode Transcript:
Kathy Baxter:
This is about creating products, technology, and designing it around humans, making sure that the technology is designed to work for us, not trying to make humans work for the technology.
Josh Birk:
That is Kathy Baxter, Principal Architect for Ethical AI Practice here at Salesforce. I’m Josh Birk, your host with the Salesforce Developer Podcast. Here on the podcast you’ll hear stories and insights from developers, for developers. Today we sit down and talk with Kathy about the very complicated topic of ethics in AI and how technical products can use it. We will continue on, as we often do, with her early days.
Kathy Baxter:
Yeah. I’ve worked as a user experience researcher at Oracle, eBay, and Google, and that was my role when I began at Salesforce as well.
Josh Birk:
Oh, okay. Well then when did you first start researching Ethical AI?
Kathy Baxter:
In 2016, when Marc Benioff said that we were going to become an AI-first company, I was working in Service Cloud, and the first AI product Service Cloud was going to release was a chatbot. For our listeners that might remember, 2016 was also the year that Microsoft’s Tay came out, the Nazi-loving, racist chatbot.
Josh Birk:
Yes.
Kathy Baxter:
That had to be pulled from Twitter in less than 12 hours. Immediately, I thought, “Oh God, how do we not do that?” I started doing research into natural language processing and natural language generation, and how toxicity and bias can end up creeping into AI. Then I started researching how you might prevent it, and shared that information back with the product team. Then I started reaching out to the user researchers working on Einstein, our AI platform, in the other clouds to start talking about the potential unintended consequences and harms for the different kinds of AI that were being built in each one of our clouds.
That started me down this pathway, and I did that for about 18 months. This was on the side, in addition to continuing to be a user experience researcher for Service Cloud as a whole. It was really interesting and I was really passionate about it. It was very important to me to do this work. In April of 2018, I saw that our then chief scientist, Richard Socher, was going to do the closing keynote for an event that I was attending. I had been sending him emails about the work that I was doing, because his team, the research scientists, was building a lot of the models that were being used in the clouds. I wanted to figure out how we might work together. Here I am, some rando in the company, just dropping emails-
Josh Birk:
Just dropping in on the Chief Scientist, yeah, you know.
Kathy Baxter:
Yeah, as one would do, and didn’t hear back. But I had been looking for him at the conference, or the event, and wasn’t seeing him. It was a pretty small event, like 200, 250 people, so it was easy to spot somebody you’re looking for. It was getting really close to the closing keynote, and I walked up to the registration desk and asked, “Is Richard Socher still going to do the closing keynote? I haven’t seen him.” They looked at me like a total rando stalker. Who are you? I whipped out my Salesforce badge and was like, “Oh, we work together. I was supposed to be having coffee with him.” Not very ethical of me. They were like, “Oh no, he’s sick. He’s going to come in five minutes before he does the closing keynote, and then after he finishes, he’s going to hop off stage and leave immediately.”
I’m like, “Oh no, he’s not. Fan boys and fan girls are going to be mobbing him. The only chance I have of getting him is before any other person sees him.” I stood behind a cement pillar where I could see the door and the registration desk, and I just checked my emails until I saw him come in. When his EA was registering him and getting his badge and everything, I popped out, and I had my three-minute elevator pitch ready and told him about the research. He was like, “Cool, have you thought about doing this full-time as a research scientist?” As you alluded to in the beginning of my background, I’m not a computer scientist or a data scientist; I am a psychologist.
No, I had not thought about doing this as a research scientist. He said, “Well, if you change your mind, let me know,” and walked away. I’m like, oh, maybe that is an option. I put together a job description and pitched it to him in August. He said, yes, this is what we need. We need somebody full-time focused on this. Then he pitched it to Marc Benioff, and Marc Benioff said, “Yes, this is what we need.” Six days later I got the email saying, “Hey, congratulations. You’re on my team now.”
Josh Birk:
Wow.
Kathy Baxter:
Yes. I became the first person whose role was solely focused on the ethical creation and design of products. Then later that same year, Paula Goldman, our Chief Ethical and Humane Use Officer, was hired as well. That’s how I made the transition from user experience research to AI ethicist.
Josh Birk:
Wow. A few things to note. First of all, for people who are unfamiliar with Salesforce culture, this dropping emails on C-level people is actually kind of a weird Salesforce thing that we do sometimes.
Kathy Baxter:
Yes.
Josh Birk:
Also, I love it, because I’ve had a few other stories on the show where basically you get the job because you’re just going to be the squeaky wheel about it. We have product managers who are product managers because they were the people always in the team’s faces about what was wrong with the product and where the product was going. It’s like, well, if you talk about it too much, sometimes it becomes your responsibility.
Kathy Baxter:
Yeah. That’s one of the things that I do really love about the Salesforce culture: it’s really supportive of this entrepreneurial spirit. I’ve worked at other companies that are much more bureaucratic. They’re like, oh, sorry, we don’t have a job ladder for that. You’re just shooting yourself in the foot by saying, “Oh, there’s a need here,” when there’s no documented HR process that supports you doing that job. Having worked at some of those companies, one of the conversations that I had with Richard when I first joined was, oh my gosh, what job ladder should I be on?
He and employee success were like, “Do you have to change ladders? This sounds like you could just stay on the user research ladder. I mean, you were doing that for the last 18 months.” I was like, “Oh, yeah, that works for me.” Here I am, trying to insert needless bureaucracy because I just expected it. I have met so many other people who also identified a need and just started doing the work. After a while, they pitched it to executives. Once you’ve got a portfolio of wins and you demonstrate that you are having impact, that you are bringing value, then it’s just kind of the culture to be like, yeah, that makes sense. You should keep doing that.
Josh Birk:
Right.
Kathy Baxter:
I absolutely love that. I haven’t encountered that at any of the other big companies that I’ve worked at.
Josh Birk:
Yeah. I got to say, I completely agree. I’m reflecting back on a story from a former employer, who will go unnamed. While I was working there, I wrote a really early version of what basically would become AJAX, asynchronous JavaScript. My partner in crime and I created this real-time dashboard that would show whether servers were up or down, and we showed it to our management and they were like, “This is so amazing.” Then we got in trouble, because it wasn’t our department, and it wasn’t our department’s job to monitor servers. The team that had that job got really angry with us, took the application down, and replaced it with Lotus Notes.
Kathy Baxter:
Wow.
Josh Birk:
Yes. Bureaucracy can be a little detrimental. Okay. Going back, because I want to focus a little bit on the start of that story, where Marc in 2016 is saying that we’re on a path to becoming an AI-first company. What does that mean to you?
Kathy Baxter:
What it meant to me, at least, was that we were going to start focusing on how we can bring automation into all the things that we are doing. How do we start thinking about leveraging machine learning to make the tasks that everyone was already doing easier, faster, more efficient? For some people, that automatically leads to how can we replace all the humans? Of course, from an ethics standpoint, I’m like, oh, dear God, no. When I first started talking to customers about chatbots, a lot of them were like, yes, let’s replace all the customer service people with bots. First of all, that was way too advanced for the technology available at that time, but it also showed a lack of recognition of the value that humans will always bring to whatever the interaction is.
There are many things where, yes, please automate this. Yes, please give me a bot to just reset my password or to update some thingy. There’s no need to sit on hold for 30 minutes to get a really simple task done or question answered, right? I am happy to go to a chatbot for something like that. There are other things where you’ve got to have a human, because the question or the issue is so complex that you really need to have a genuine conversation with someone. There will always be some aspects that we are not going to be able to a hundred percent automate.
Josh Birk:
Got it. Let’s talk about the chatbot. We might have to frame this in terms of what machine learning is. We can fill in the gaps a little bit on how this works, but the real question in my head right now is, going back to Microsoft’s experiment gone horribly wrong, how do chatbots go bad? How do they go evil?
Kathy Baxter:
First and foremost, it’s really the training data that you give it. A lot of language models or agents are trained on data that’s freely available on the web. That could be things like Twitter or Reddit or IMDB. It’s not surprising when you get toxicity spewing out of those agents, and with Tay, the problem was that the Twitter trolls were intentionally saying horrible things to it, trying to get it to say horrible things back.
Josh Birk:
Oh, really?
Kathy Baxter:
Yes. Because machine learning is learning over time, people realized that the more hateful, awful things they said to it, the more hateful, awful things it started saying back. There are a lot of safeguards that you need to put into place. You need to ensure that the data that you are giving it is not full of hate and stereotypes and things like that. Sometimes there are different guardrails that you can put in place to prevent it from saying specific things or responding to certain questions. I remember a lot of work I read about in the early days with Alexa or the Google Home agent, where if you asked it certain questions, the conversational AI designers would have it respond, “I’m sorry, that’s not a question,” or, “That’s not polite,” or, “I’m sorry, I’m not comfortable with that question.”
If you tried to continue pushing it, then it would stop responding, because they didn’t want to gamify it. They didn’t want to reward the person on the other side of the speaker for continuing to see how far they could push it. Being really thoughtful in the conversational design about what it’s allowed to say. Now with OpenAI, they have spent a lot of time trying to put safeguards in place with their various GPT language models and now ChatGPT, which is the newest toy everybody is playing with.
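To make the kind of scripted guardrail Kathy describes a bit more concrete, here is a minimal, hypothetical Python sketch: blocked topics get a polite canned refusal, and repeated attempts eventually get no response at all, so the behavior isn’t gamified. The keyword list, responses, and attempt limit are illustrative assumptions, not any particular assistant’s implementation.

```python
# Minimal sketch of a conversational guardrail: deflect blocked topics with a
# scripted refusal, and stop responding after repeated attempts so the person
# isn't rewarded for seeing how far they can push the bot.
# Keywords, responses, and the attempt limit are illustrative assumptions.
from typing import Optional

BLOCKED_KEYWORDS = {"insult_example", "harassment_example"}  # placeholder terms
POLITE_REFUSAL = "I'm sorry, I'm not comfortable with that question."
MAX_BLOCKED_ATTEMPTS = 3


class GuardedChatbot:
    def __init__(self) -> None:
        self.blocked_attempts = 0

    def respond(self, user_message: str) -> Optional[str]:
        text = user_message.lower()
        if any(keyword in text for keyword in BLOCKED_KEYWORDS):
            self.blocked_attempts += 1
            if self.blocked_attempts >= MAX_BLOCKED_ATTEMPTS:
                return None  # go silent rather than gamify the interaction
            return POLITE_REFUSAL
        return self.answer(user_message)

    def answer(self, user_message: str) -> str:
        # Placeholder for the real response logic (intent classifier, LLM, etc.).
        return "Happy to help with that."


if __name__ == "__main__":
    bot = GuardedChatbot()
    print(bot.respond("How do I reset my password?"))
```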
Josh Birk:
Yes.
Kathy Baxter:
Unfortunately, some folks have found ways around some of the safeguards that they have put into place. This is always going to be a game of whack-a-mole. We’re dealing with humans, and there are always going to be some human beings who try to break things. It’s just the way it works. You always have to have a practice, a process in place where you put yourself in that mindset. You ask, how might somebody use this with intentional malice or alarming stupidity? There’s a reason why on the outside of the box of Preparation H it says “For external use only.” You know at some point somebody said, “I ate this whole tube and it hasn’t helped at all.”
Then, also, how might somebody use this in orthogonal ways that never occurred to us? They’re not trying to break it. They’re not trying to do bad things. They just thought of a completely different way of using it that didn’t occur to you and that could unintentionally cause harm. All of those things have to be in place, and there has to be constant monitoring.
Josh Birk:
Got it.
Kathy Baxter:
Rather than just set and forget.
Josh Birk:
That sounds to me like it goes back to, it’s adjacent to, what you’re saying about how we can never really replace all the humans, because this feels like you’re fighting humans with humans. Apologies if that’s a violent metaphor, but the machine can’t consider what that human behavior was. You need a human co-pilot to say, “Oh, we have to put up some fresh guardrails.” Is that correct?
Kathy Baxter:
Yes. I was actually giving a talk this morning, and someone in the audience asked how we teach machines to respect copyright law. My response was, you don’t. You teach humans to respect copyright law. It’s about what you feed into the model. Do you have an entire process of opt-in consent, of being able to check for copyright? That is not something that is going to occur to a machine or a language model to look for and decide whether or not it should use in its process of creating new content.
Josh Birk:
Right. A chatbot gone terribly wrong, it’s pretty obvious why we don’t want our AI customer assistants insulting our audience or anything like that. What are some of the bigger things? Why is this important to a corporation like Salesforce, above and beyond doing the right thing and making sure our AIs are treating people like people?
Kathy Baxter:
The chatbots are a really easy example. So many people encounter them, or your home voice assistants, whether it’s a speaker or your phone or even your car now. AI is pervasive in so many things that we are using today and in so many decisions that impact our human rights, whether we realize it or not. Examples include facial recognition. There are some government-assisted housing complexes where they have tried to add in facial recognition to determine whether the people coming into the building are actual residents or someone else, validating if the people walking into those buildings were allowed in or not. That’s something you would never see in an affluent, high-end apartment complex in New York City.
We see government using AI to help make what they hope are going to be more fair and balanced decisions about how to distribute Social Security or Medicaid benefits, or to detect fraud in IRS submissions and unemployment benefits. It’s understandable: our government is stretched thin. There’s way more work that has to be done than there are people, so applying AI to automate that makes sense. You can understand that. And recognizing that humans are biased, it’s very attractive to say, let’s use AI, because of course machines are neutral. But the AI is only as neutral as the humans that make it. It is so imperative, whether it’s healthcare or financial services or government services, when you are using AI, to stop and ask not just can we do this, but should we do this? Who’s represented, who’s left out, who benefits, who pays? And then, again, constantly monitor to make sure that whatever safeguards you put in place actually work, that you can validate your assumptions about what might go wrong and catch the things that you couldn’t predict.
Josh Birk:
Gotcha. What’s an example of something Salesforce is using AI for right now that people, first of all, might not consider, and second of all, might not consider how it could go wrong?
Kathy Baxter:
We have really been incredibly intentional with the AI that we build. A lot of our AI, compared to some other consumer companies, can probably seem very vanilla, very boring. Within Service Cloud, there’s case classification: Is this a refund question? Is this a password reset? Is this a tax question? Then route that to the right agent. Obviously you want it to be accurate, but there’s not a whole lot of ethical concern there.
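As a toy illustration of the case classification Kathy mentions, here is a hypothetical sketch that predicts a category for an incoming case and routes it to a matching queue. The categories, keywords, and queue names are made up for illustration; a real system would use a trained classifier rather than keyword matching.

```python
# Toy sketch of case classification and routing: predict a category for an
# incoming support case and send it to the matching queue.
# Categories, keywords, and queue names are illustrative assumptions.

ROUTING_RULES = {
    "refund": "billing_queue",
    "password_reset": "identity_queue",
    "tax": "finance_queue",
}

KEYWORDS = {
    "refund": ["refund", "money back", "charged twice"],
    "password_reset": ["password", "locked out", "reset"],
    "tax": ["tax", "vat", "invoice"],
}


def classify_case(subject: str) -> str:
    """Very naive keyword classifier standing in for a trained model."""
    text = subject.lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return "general"


def route_case(subject: str) -> str:
    """Map the predicted category to a destination queue."""
    return ROUTING_RULES.get(classify_case(subject), "general_queue")


print(route_case("I need to reset my password"))  # -> identity_queue
```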
Josh Birk:
Got it.
Kathy Baxter:
We’ve really been mindful and drawn a red line in some places. For example, we don’t allow Einstein Vision to be used for facial recognition.
Josh Birk:
Got it.
Kathy Baxter:
I can’t think of anything off the top of my head where it would be a surprise that we are using AI in a particular use case. It might be more surprising in the cases where we’ve been very intentional and decided not to use AI.
Josh Birk:
Interesting. What does that policy look like for a company the size of Salesforce? How do you identify a mature policy, where you’re like, yeah, that corporation’s keeping an eye on this?
Kathy Baxter:
It is such a good question. Scale is always an issue. There is a lot of advance review that we do. We have something called our release roadmap planning summits, where each of the clouds will share their roadmap plans for the next two to three releases. We very carefully pay attention to that, and for anything that looks like it might have a particular ethical concern, we will reach out to the team. In many cases, we have close relationships with the team anyway, so there’s really not anything that comes out as a surprise, but every so often we see something that we hadn’t expected.
We’ll reach out to the PM and say, “Hey, tell me more. That looks interesting,” and then have a whole conversation about should this exist? What are the particular risks along the way? It is a matter of really prioritizing; not every product, not every model is going to need our attention. We’re not going to work with every single team, but during that scanning process, we are looking for anything, and then doing outreach and having conversations to make sure that we are engaged in the places where we really need to be engaged.
Josh Birk:
Got it. Okay, I’m kind of curious what that looks like from maybe the other side, like developers. The data model is often the root of all things, the start of all things. Obviously, if they’re working with something like OpenAI, these things would be first and foremost. But what if it’s just a normal application where something might head down that road? Are there considerations for these kinds of things that developers should think of on day zero?
Kathy Baxter:
This is where education of employees comes in. From the time that they are hired, in new hire orientation, we communicate that ethics is everyone’s responsibility, just like security is everyone’s responsibility. We have a very large security team, but we are still responsible for making sure that our passwords are secure, that we’re using VPN, that we’re not clicking on dodgy links for a free gift certificate, crazy little things like that. Similarly, ethics is everybody’s responsibility. Everybody should be asking, should this exist? We need to give them a handful of tools and heuristics to keep in mind. We certainly are not expecting them to become ethicists, but they should know what the sensitive variables are. If you use those in your model, you might be introducing bias. A lot of these are probably very obvious to people: age, race, gender, religion. If those are factors you’re using in the model, there’s a good chance you’re introducing bias.
There’s also the concept of proxy variables. In the US, because of our history of redlining, zip code is often highly correlated with race, so zip code is something you don’t want to use. Interestingly, one of our very first AI features, or models, that we released was sales lead scoring. When the team started building in explainability, why is this lead a good lead, the number one predictor as to why somebody was a good lead was the first name John.
Josh Birk:
Really?
Kathy Baxter:
Yes. That is because, based on the training data that we had used, John is the most popular first name in the US for men, and there were way more men in this data set. The name John was a proxy for gender, and depending on the name, it can even be a proxy for race or religion or country of origin. The team immediately removed first and last name as factors in the model. That was something that was communicated to other teams: we don’t use names in any of our models, because frankly what your mama named you really has nothing to do with any of the predictions that we are trying to make. If all of a sudden we started making AI where what your mama named you matters, that would be a very, very interesting time. I don’t see that being related to CRM.
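As a rough sketch of the practice described here, the hypothetical Python below drops explicitly sensitive columns (including names) from a training set and flags potential proxy variables by measuring how much information each remaining feature carries about a sensitive attribute. The column names and threshold are assumptions for illustration, not Salesforce’s actual pipeline.

```python
# Minimal sketch: remove explicitly sensitive columns (and names, which can act
# as proxies) before training, and flag remaining features that carry a lot of
# information about a sensitive attribute such as gender.
# Column names and the threshold are illustrative assumptions; features are
# treated as categorical labels for the association check.
import pandas as pd
from sklearn.metrics import mutual_info_score

SENSITIVE_COLUMNS = ["age", "race", "gender", "religion", "first_name", "last_name"]
PROXY_THRESHOLD = 0.1  # association score above which a feature warrants review


def flag_proxy_features(df: pd.DataFrame, sensitive: str) -> list[str]:
    """Return feature names whose values are strongly associated with `sensitive`."""
    suspects = []
    for column in df.columns:
        if column == sensitive or column in SENSITIVE_COLUMNS:
            continue
        score = mutual_info_score(df[sensitive], df[column])
        if score > PROXY_THRESHOLD:
            suspects.append(column)  # e.g., zip_code correlating with race
    return suspects


def prepare_training_data(df: pd.DataFrame, sensitive: str = "gender") -> pd.DataFrame:
    """Drop sensitive columns outright; surface proxies for a human to review."""
    proxies = flag_proxy_features(df, sensitive)
    print(f"Potential proxy features to review before training: {proxies}")
    return df.drop(columns=[c for c in SENSITIVE_COLUMNS if c in df.columns])
```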
Josh Birk:
Gotcha. Are there resources that you would recommend for developers to start with?
Kathy Baxter:
We have a Trailhead module called Responsible Creation of Artificial Intelligence, and we can provide a link in the show notes. We also have our Building Ethical and Inclusive Products Trailmix. In addition to thinking about things like bias and getting representative data sets, when we talk about responsible AI, there’s a lot more to it. We want to make sure that the AI is accessible to everyone. We want to think about using inclusive language. We want to make sure that it’s designed in a way that is explainable and trusted. We have quite a few different Trailhead modules on these different concepts. That Trailmix is another really great resource to expand one’s understanding of what it means to design technology, not just AI, but all technology, responsibly.
Josh Birk:
Gotcha. Kind of a bonus question. You mentioned ChatGPT earlier, and it’s a really hot topic right now. Like you said, it’s the hot new toy everybody’s playing with. Just your opinion on what the medium-term future looks like for that: should coders be afraid that their jobs are in trouble?
Kathy Baxter:
I get this question so often. I was actually speaking with a user researcher yesterday who, unfortunately, was laid off. He was looking for job openings and is not seeing much in the way of user researcher roles; he’s seeing lots for designers. This harkens back to the days of the dot-com bust. I was there during the dot-com boom and the bust, and as a user researcher, I remember that situation very well, where companies wanted designers who could also test their own designs. Of course, that’s terrible. You do not want that. When companies are trying to do more with less, this whole culture of “and,” you do this and that, really becomes pervasive.
He was worried that these AI content generators could end up taking over the job of user researchers. I think for something like user research, this is not a task that can be automated. It requires human intuition and human connection to be able to understand context, how humans are being impacted, what the real effects are, and then how we might mitigate them. I don’t think this is something that can ever be automated. Design, on the other hand, we are seeing more and more articles where, instead of hiring or using an illustrator on staff to create all of the illustrations for your newsletter or whatever the publication is, they’re captioning at the bottom: this is Stable Diffusion or Imagen or whatever else.
Josh Birk:
Right.
Kathy Baxter:
I do think there’s much more of a concern on the design side. We will likely see a lot more content generation for things like marketing text, coming up with marketing campaigns and emails and things like that.
Josh Birk:
Interesting.
Kathy Baxter:
You still have to have a human that oversees that. Some folks might have seen this, and we can provide a link to the story, but KFC in Germany created a holiday calendar and used AI to automatically generate promotions for various holidays. There is one holiday that is related to the Holocaust, and they automatically generated and released a promotion for free fried chicken for that holiday.
Josh Birk:
Oh.
Kathy Baxter:
Incredibly insensitive.
Josh Birk:
That’s one of the most cringeworthy things, I think, on this show.
Kathy Baxter:
Yes. They, of course, apologized profusely and said there is supposed to be a human that oversees every campaign before it’s launched; they don’t know how this happened. But again, there’s no way an AI knows what is appropriate and what is offensive. It can’t understand this particular holiday. You need a human to go through that holiday calendar and say, this is not a holiday we should be offering free chicken for, and remove it from your AI-generated planning.
Josh Birk:
Right. Yeah. I wish I could attribute the source, but I just read it randomly, I think on Twitter. The response to it was basically, well, in order for the AI to run perfectly, the client has to describe what they want. I think our jobs are probably safe.
Kathy Baxter:
Yeah, there are so many cases where the value that humans provide is undervalued until something goes wrong. Those are the moments where you see the limits of automation and the importance of humans in the loop.
Josh Birk:
That’s our show. Now, before we go, I did ask after Kathy’s favorite non-technical hobby, and well, it sounds like it could be a very handy one.
Kathy Baxter:
I live in the Bay Area, and there is a place called The Crucible. You can do all kinds of crafts and making there, from glass blowing and glass fusion to woodworking, ceramics, bike repair, you name it, they do it. That is my treat to myself. I try to make sure that throughout the year I’m going there and taking classes. I love anything related to glass and jewelry making. My next one is woodworking. That is my favorite non-tech hobby. In my retirement, I’m going to have a shed in the back with a forge. I’m going to be making knives and blowing glass and all kinds of things. I’m just going to be the crazy lady in the back using fire and metal and all kinds of things.
Josh Birk:
I love it. In case of apocalypse, I might have to head over to your house.
Kathy Baxter:
You’re welcome to join me.
Josh Birk:
I want to thank Kathy for the great conversation and information. As always, I want to thank you for listening. Now, if you want to learn more about this show, head on over to developer.salesforce.com/podcast, where you can join our community, hear old episodes, see the show notes, and find links to your favorite podcast service. Thanks again, everybody, and I’ll talk to you next week.