Harry Glorikian Interviews Massimo Buscema

EPISODE SUMMARY

Harry’s guest in this episode is Massimo Buscema, director of the Semeion Research Center in Rome, Italy, and a full professor at the University of Colorado at Denver. Buscema researches and consults internationally on the theory and applications of AI, artificial neural networks, and evolutionary algorithms. The conversation focuses on AI and its applications in healthcare, and how it can enhance what we can see and uncover what we cannot.

Harry Glorikian Interviews Paolo Massimo Buscema

Harry: Massimo, welcome to Moneyball Medicine.

Massimo Buscema: Thank you for this invitation.

Harry: So, Massimo, let’s start at the simplest level. Tell me a little about Semeion.

Massimo: Okay. Semeion is a private research centre in Artificial Intelligence, legally recognised by the Italian State (by the Italian Ministry of Research). We have worked in Artificial Intelligence since 1985. That is a long time. For this reason, Semeion has an official connection with UCD – the Department of Mathematical and Statistical Sciences in Colorado. And we work together. But give me the time to make a little overview of Artificial Intelligence from our point of view. I am an applied mathematician, and I think that behind the big news about Artificial Intelligence, the field is actually divided into three approaches. The first is the people who try to simulate the human brain to understand better how the human brain works. This is the simulation approach. The second approach is emulating the human brain to build more effective computers that work in different ways from the human brain, but with more effectiveness. The third approach is my approach, and we will call it the “physical approach”. Taking inspiration from the human brain, we try to understand the invariant laws by which individual behaviours become collective. That is very interesting because it analyses how single individuals are assembled and rearranged in a crowd, how single atoms shape molecules, how single ants make a swarm and so on. For this reason, Artificial Intelligence, from this point of view, is an intelligent way to understand what the human brain is unable to do. So, our approach, as a key word, is fusion. The human brain is very capable of understanding many facts, but it is completely blind when it has to understand other information. So, we try to build systems that are able to learn and to understand the hidden information embedded in the same landscape that is not evident to a human being.

Let me give an example. When a radiologist analyzes a computed tomography scan of a lung, the radiologist is able to see something, but we discovered that more than 70% of the key information in the image is completely hidden from the radiologist. So, we built a system that is able to discover this hidden information using only one picture. Usually the radiologist expects the mathematical tools that support him to analyze an image the way an expert’s eye does. And the classic mathematical approach has tried to work in this way: mathematicians have built systems able to simulate some features of the human eye. However, we discovered a way to turn this philosophy upside down. I tried to understand how each pixel of the image sees itself in the picture, so I give each pixel a bunch of equations through which it can talk to its neighbouring pixels, and this dynamic gossip between pixels will, at the end, show me the image from the point of view of the pixels. Lots of new pieces of information will appear, information that was completely hidden from the radiologist. And we have evidence of this in the many scientific publications that we produced with this new patent named ACM (Active Connection Matrix). In other words, in our systems each image is transformed into a matrix of active connections, where the pixels are agents that dynamically modify themselves according to the global geometry of the assigned image.
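As a rough illustration of the "pixels talking to their neighbours" idea, here is a minimal Python sketch. It is not the ACM equations, which are not given in this conversation. The function name `neighbour_dynamics`, the similarity-based connection weight, and the `steps` and `rate` parameters are all assumptions made for the example, and pixel values are assumed to lie between 0 and 1.

```python
def neighbour_dynamics(image, steps=10, rate=0.5):
    """Let every pixel repeatedly adjust itself after "talking" to its neighbours.

    Hypothetical toy rule, NOT the published ACM equations:
    the connection between two pixels is strong when their values are similar,
    and each pixel moves toward its neighbours in proportion to that strength.
    `image` is a list of rows of floats in [0, 1]; the input is not modified.
    """
    height, width = len(image), len(image[0])
    current = [row[:] for row in image]
    for _ in range(steps):
        updated = [row[:] for row in current]
        for y in range(height):
            for x in range(width):
                pixel = current[y][x]
                pull, count = 0.0, 0
                # Visit the (up to) eight surrounding pixels.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (dy or dx) and 0 <= ny < height and 0 <= nx < width:
                            neighbour = current[ny][nx]
                            # Assumed connection weight: 1 for identical values,
                            # 0 for maximally different ones.
                            weight = 1.0 - abs(neighbour - pixel)
                            pull += weight * (neighbour - pixel)
                            count += 1
                if count:
                    updated[y][x] = pixel + rate * pull / count
        current = updated
    return current
```

Only a single image goes in, and the result is the same image after several rounds of local exchange. A perfectly uniform image is left unchanged, because no pixel has anything new to tell its neighbours.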

Harry: But Massimo, the AI system is a pattern recognition process. We always talk about it understanding or whatever, but it has no human understanding. I always try to remind myself that it has no human capabilities when we use words like understanding or intelligence, right? It is a program that matches a pattern. It makes a correlation and then gives the human an output of what patterns it recognises. So, when you are talking about pixels looking at other pixels next to each other in an image, essentially what you are saying is that they are communicating with each other, or building upon themselves cumulatively. What is the benefit of that when the answer is coming to a radiologist? What is the unseen picture that is emerging?

Massimo: Two real examples that I published in peer-reviewed papers. The first one is a snapshot of a digital angiography. The radiologist would look at the snapshot and say ‘okay, everything is okay here, I use contrast media here, so I can see the flow of the blood, more or less’. Now we put the same snapshot, that is the same image, into our system. The pixels start to interact with each other and the original image modifies itself dynamically, and at the end you have a new shape of the outer area and you see a strange occlusion, one that was completely absent from the source image. At this point the radiologist said ‘oh that is impossible, I saw the outer area personally with my angiography and there was no occlusion in this area’ and we said to him ‘okay, but you have to check the person because you usually make a check after three days, after the surgery, so you have the opportunity to verify what we are showing’.

He did it, and he discovered exactly this kind of occlusion in the outer area, exactly in that part. After this, we made many independent applications similar to this case. For example, we have a computed tomography slice of a lung, and there is, in this lung, a very small ground-glass opacity. This is a very small lesion, and usually the radiologist says ‘okay, come back after one month and we will repeat this examination’, in order to examine whether this lesion is changing or not. This person unfortunately forgot, and he came back after three years.

It is a clinical case that was reported in the literature. Over time, the original lesion transformed into a big aggressive tumour. At this point we used only the original image from three years before, where you have a very small lesion, and the pixels working with each other transformed the small lesion into a big shape that, paradoxically, matched completely the image taken three years later.

Harry: What you are saying is that this technology is predictive?

Massimo: Yes, but let me say that this prediction is a prediction in space but not in time, because the system makes no probabilistic calculations with the image from three years before. It tries to code only the hidden information that was already present in the image but was not evident to the radiologist, because the tumour was already at work at that time. Everyone knows that in the lung, before the tumour builds up, the lymphatic vessels form first. And lymphatic vessels are similar to the blood but with a very low amount of proteins. So, with all X-rays, they appear black on black, with little variations of black that the human eye cannot detect. There is information in these areas, and this scan understands this.

Harry: Okay, but Massimo, you are saying that the level of programming, this is not a trivial level of data points that the system needs to evaluate simultaneously. You have to give it the data to look for this and that and the implication of this equals this next portion. So, the number of calculations, the number of data points it’s looking at simultaneously is not trivial. So, programming this is not trivial.

Massimo: Oh, it is easier than people can imagine because I make each pixel, a pixel of any image, become an agent. Just as if it became a neuron. It tries to connect, tries to talk with the neurons close to it, and each one does the same, and the image transforms itself dynamically over time. You need only one image, not many images, to be learned. Because in this case I chose an example of how to discover hidden information in an image.

If you want to talk about deep learning, I have many things to say about deep learning, applied to many other fields. The first question is when deep learning is really deep and when it is fat learning. The distinction is fundamental. When you think about deep learning, usually you have a big narrow network with a big input and many layers. Each layer reads the previous one, up until the end where you have the output. In this case you have one difficulty, which is that you don’t know what has happened in the middle, and this poses a problem: how can this system explain why it takes one decision and not another one? But the big thing is that our brain is not made of one big network with many layers. Our brain is made of many, many different small networks, working together without knowing that they are working together. If I sometimes call the current deep learning fat learning, it’s because you have the input, the first layer reads the input, the second layer reads the first layer, the third layer reads the second and so on.

Harry: Yes.

Massimo: That is not deep.

Harry: Yes. I understand what you are saying, but you are talking about deep learning networks communicating with each other simultaneously to come up with an answer.

Massimo: Yes, we built an ecology of systems. Many different systems working from the same input, and, in this way, they try to combine the same inputs from different points of view to arrive at a global solution. In this way you have the same object – the data – seen from different points of view, because every network makes different errors, and if you have different errors, you can combine them to improve the correct answer. Because the human brain teaches us that we live if we make mistakes. If we are not able to make mistakes, we have no chance to live. And each one of us (the fingerprint of each of us) is the style of our mistakes. Each one of us makes different mistakes, and these are our fingerprints.

Harry: Yes.

Massimo: If we are able to create an ecology of different deep learning systems, working simultaneously on the same data set, and then filtered afterwards by another network whose task is to decide, in each case, which networks are right and which are wrong, then we have real deep learning.
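The "ecology plus a filtering network" idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Semeion's software: the two "networks" are hand-written rules (`model_a`, `model_b`), the referee is a simple tally of past successes rather than a trained network, and `region_of`, `train_referee` and `referee_predict` are hypothetical names invented for the example. The toy also trains and evaluates on the same ten cases, which a real study would not do.

```python
from collections import defaultdict

def train_referee(models, cases, labels, region_of):
    """For each region of the input space, count how often each model was right."""
    hits = defaultdict(lambda: [0] * len(models))
    for case, label in zip(cases, labels):
        for i, model in enumerate(models):
            if model(case) == label:
                hits[region_of(case)][i] += 1
    return hits

def referee_predict(models, hits, region_of, case):
    """Answer with the model that has the best track record for this kind of case."""
    scores = hits.get(region_of(case))
    if scores is None:
        # Region never seen in training: fall back to a plain majority vote.
        votes = [model(case) for model in models]
        return max(sorted(set(votes)), key=votes.count)
    best = max(range(len(models)), key=scores.__getitem__)
    return models[best](case)

# Two deliberately imperfect "networks" that make different mistakes.
def model_a(x):
    return int(x >= 3)   # wrong for x >= 7

def model_b(x):
    return int(x < 7)    # wrong for x < 3

def region_of(x):
    return "low" if x < 5 else "high"

models = [model_a, model_b]
cases = list(range(10))
labels = [int(3 <= x < 7) for x in cases]   # the true answer is a band

hits = train_referee(models, cases, labels, region_of)
combined = [referee_predict(models, hits, region_of, x) for x in cases]
```

Each model alone gets some of these cases wrong, but because their errors fall in different regions, the referee can route every case to the model that is reliable there.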

Harry: So, what is the difference between that and reinforcement learning?

Massimo: Ah, reinforcement learning is the only way to reinforce the correct answer and inhibit the wrong answer during the learning process. And that is very useful for robotics. Also, in this case we can use reinforcement learning and correlation learning, if you want. But they are the same family of mathematical problems.

Harry: Yeah, I think about reinforcement learning like my mother, she makes sure that I do exactly what she needs me to do.

Massimo: Yes, but let me say this. When you are a baby, you learn a lot of things without a teacher. You learn to walk, you learn to talk, you learn to be astute, you learn how to lie and so on. And you are very good at those steps. But, when you go to school, the day before you go, you have a lot of “why”, “why this”, and these are linguistic, tied to a specific function of the brain that says, ‘they need to tell me why’. This is typical of a child before school, around 4 or 5 years old. Okay, when they go to school, after the first day, there is a teacher that starts to say “this is a”, “this is b”, “this is c” and so on. And so, you have reinforcement learning, it is a type of supervised learning, okay?

Harry: Yeah.

Massimo: After this experience the baby stops saying, “tell me why?”, that’s the problem.

Harry: So, we shouldn’t go to school anymore? That’s fantastic, I used to say this to my mother.

Massimo: No, we have to change the way in which we teach. Because, every time that you force a person to learn something, giving them the standard in a compulsory way, the brain works differently. I would say that unsupervised learning is the best way to learn deeply.

Harry: Okay.

Massimo: If an artificial system can learn this way, it can be very productive. More productive.

Harry: So, let’s bring that back to healthcare, because making this connection between the two is critical. So, you have worked in Alzheimer’s, autism, genetics, colon cancer, cardiac and many more areas, right? So, where do you see the biggest impact, in healthcare, that can improve the outcome of a patient while decreasing cost and impacting quality?

Massimo: Okay, let me give an example that could be more representative: autism. To diagnose an autistic person, we analyse an EEG sample. We are able to understand (or the system can understand), by looking at roughly two minutes of EEG, whether the person belongs to the autistic spectrum or not. This is the advantage of the system. You know that we have no therapy for autism, or no effective therapy. But if we can know that a person is autistic at the age of two years old, we are able to create effective therapy for them, because at this age the brain is very plastic. If we try to do the same when they are 5 or 6 years old, there is no chance that our therapy can work with the same effectiveness. So, if we are able to take the EEG of a one- or two-year-old and predict whether they will become autistic or not, we have a chance to take care of this person from an early age, and many of them can have their abilities increased because their brain plasticity is very high at this age. And this can have a big impact on health costs, because an autistic person has to be helped and supported by the family and the state for their whole life. It is a big cost, and for the family it is a big worry, because the father and mother are usually scared to die, as they don’t know who will support their child afterwards. So, being able to predict autism at two years old is fundamental for quality of life, from the ethical viewpoint, for medicine and for health costs.

Harry: Yeah, I see what you are talking about, and it seems like you could input more into that. In other words, if you had genomics information, if you had some physical input, if you could build more into this neural network, it would be able to subtype the autism, rather than just diagnose it.

Massimo: For example, you can collect the urine when a baby is one month old, and if we have the metabolomic analysis of the urine, we can predict, or we try to predict, whether this baby will become autistic or not. Putting together all this data that you can collect when the child is very young, you can create effective therapy for these people.

Harry: So, let’s step back for a second and talk about the field overall. I see many, many models emerging right now. Particularly here in the US, companies are using software as a service. If you think about Amazon and their cloud process, or you look at Google, which will not give away or sell their TensorFlow chip, but they will let you access it through software as a service.

Massimo: We use this, but we also use our own software where we program directly.

Harry: And then you see companies like Nvidia which are selling their chips for people to use internally or maybe a combination of all of them, but then it’s interesting because that’s almost like a brain in the cloud right? But then I am seeing companies like Apple who just made an acquisition where they want the software to be closer to the person using it. Possibly for privacy, possibly for speed, etc. But where do you see the field going? Do you see AI being local, or being removed from the immediate area?

Massimo: You have posed many questions that need an answer. So, the first one: I think that most of the applications that you are talking about, by Google or by Amazon, are based on deep learning and on big data. I think that, from my experience, big data is another big issue. The problem for learning is to reduce big data, because big data by itself is full of garbage. The main goal is to reduce big data to a small database. When a human being works from a small experience, they are able to transfer this small experience to many other different experiences. So, the main work that we have in Artificial Intelligence is how to capture, from big data, the small and thick data sets. These are the only ones that are really significant for real deep learning.

Secondly, when you use big data, you can build something that could be impressive for common people and could also be, from an actual point of view, critical. But, from a scientific point of view, if we really want to use Artificial Intelligence to help people to take decisions, it is not a big help. We need to create small data sets. So, each system needs to be able to intelligently grasp small data sets from the big data set and then work and learn from the small significant data sets to reach a conclusion. And the key word for this task is fusion: the fusion between human knowledge, the human decision, and the second opinion provided by the AI system. So, I see that most of the activity of Artificial Intelligence could be local, or it could be in the cloud; this makes no difference. The difference is, in my view, the capability of the system to find the key information in the big mass of data. Remember that the key information is not the most frequent information. Most of the key information is light, hidden information because many times, from a mathematical point of view, it represents the highest harmonics. You know that, in the theory of physical networks, the weak link is more important than the strong link. And it is the same for learning.

Harry: Yeah, it is the same for strategy. Where do you see this field going? I know you are doing work all over the world, but how do you see it being implemented in the field of healthcare? And do you have any examples? Maybe in the work you are doing with some of the hospital systems in Denver or maybe in Italy itself.

Massimo: Yeah, first of all, I see that these systems could be very useful in the biological explanation of many diseases, especially rare diseases. Because, for rare diseases, big pharmaceutical companies have no economic interest. Consequently, a person with a rare disease often receives a wrong diagnosis. These people lose a lot of time and the disease becomes chronic. So, rare diseases are very important, and we are building systems able to explain how the variables that you collect on the body interact with each other in a specific way, to give an explanation from a biological point of view about these diseases. This is from a research point of view. From an applied point of view, it can work because the system learns to distinguish these diseases from other diseases in order to understand, in a blind way, why you have disease A, or disease B, or disease C. This could be a second opinion to add to the opinion of the clinician. And I am sure that this fusion increases the effectiveness of the diagnosis.

Harry: Switching gears for a minute, you are like a super mathematician. So, for all the rest of us, who are not super mathematicians or specially trained, where do you see the software going? Do you see it being brought to a level where the average person would be able to create a program, or an algorithm, or a small piece of software that would be able to do this work? Or do you really have to understand the deep math behind it to be able to create it? Because we are not graduating enough people.

Massimo: Yes, of course. Let me say that I think what people, professionals and normal people, have to know is how to use the philosophy of these systems. They don’t need to know the mathematical details. And our software is built so that every person with a basic education, obviously, can use it. Sometimes it can happen that a centre with which we cooperate learns to use our software better than we are able to. And they are not specialists in mathematics, which is fundamental. But let me add also that the philosophy of this system is very easy. Imagine that in the medical field, the true fingerprints of a person are their biological output. I mean that, if you want to understand the health of a person, the output is the most evident thing; it is the first step to understanding how their health is. So, we are what we throw out. And it is the same from a criminal point of view. If you want to understand a criminal’s activity, you don’t need to inspect the apartment, but you have to analyse the garbage basket of this person. Because, from what they throw out, you understand who they are. I mean that the philosophy is that the output, what we throw out, is the key feature for understanding our problem, our philosophy and so on. That is fundamental for people to know. And another thing that is important to understand, for the normal user of the system, is that learning is a question of time. Just like the baby: the baby sees the same thing many times, and then they are able to generalize. So, one short answer is not believable. So, fundamentally, to use this system, the only need is to be open to understanding the philosophical basis of these systems and not the mathematical details.

Harry: Massimo, grazie mille. Thank you.

Massimo: You’re welcome.

Harry: It was wonderful, and I look forward to keeping in touch and learning more about what you are doing at the centre. And I am sure we will have many conversations as this field evolves over time.

Massimo: Thank you for this invitation and for the very nice conversation. Thank you.