AI Reader Parser


What light through yonder window breaks? ...and Juliet is the sun...

...'cause she's a bright ball of gas...

...Juliet is the star in the sky... Hello world, welcome to Sirajology! In this episode we're going to build an AI reader, that is, an AI that can analyze human language. This type of task is considered natural language processing, or NLP. NLP is the art of solving engineering problems that need to analyze or generate natural language text, and we see it all around us.

Google needs it to understand exactly what your search query means so it can give you back relevant results. Twitter uses it to extract the top trending topics. Microsoft uses it for in-car speech recognition. And it's all basically extremely dope because it deals with language. Kurzweil once said that language is the key to AI: a computer able to communicate indistinguishably from a human would be true AI. There are 6,500 known languages in the world, and each of them has its own rules for syntax and grammar. Some rules are easy, like "i before e, except after c," and some are based on intuition and have no consistent usage. So how do we write code to analyze language? Before the eighties, NLP was mostly just a bunch of hand-coded rules, like: if you see a word like "dance" with an "-ing" ending, label it as present tense.
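Just as a toy illustration of what that kind of hand-coded rule looks like (this function is hypothetical, built only from the example above, not from any real system):

# A brittle, hand-coded rule: anything ending in "-ing" is "present tense".
def crude_tag(word):
    if word.endswith("ing"):
        return "present tense"
    return "unknown"

print(crude_tag("dancing"))  # "present tense"
print(crude_tag("ceiling"))  # also "present tense" -- one of a million corner cases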

While this worked, it was really tedious, and there are a million corner cases; it's just not a sustainable path to language understanding. The way forward was, and is, machine learning. So instead of hand-coding the rules, an AI learns the rules by analyzing a large corpus, or body of text. This has proven to be very useful, and applying deep learning to NLP is currently the bleeding edge.

so when thinking about what tool to demo in this video I was really torn between

an NLP api called apia I and Google’s newly released english parts are party make parts fit style apna I takes in an input query and returns an analysis of the text party is Google’s newly released english parts are both have similar functionality but i’m going to go with party because a it’s currently the most accurate parts

are in the world be if you built it into your app that’s one less networking call

you have to make which means you can parse tax offline and see building this party logic from the source allows you to have more granular control over the details of how you want text to be analyzed party was built using syntax net an NLP neural net framework for Google’s tensorflow machine learning library so we can use syntax net to build our own partner or we could use a

pre train parts or parts

Yeah, let's do that. Once you parse your text, there's a whole host of things you can do with it. Let's try it out with our own example. We're going to build a simple Python app that uses Parsey McParseface to analyze a command from a user and then repeat it back to them, but worded differently. We'll begin by importing our dependencies, then we'll set up our program to receive and store the user input. The input text is the corpus we'll be analyzing, and we can get an array of all the part-of-speech tags in the input text using the tagger function.
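My Parsey McParseface wrapper isn't shown on screen, so here's a minimal sketch of those same two steps using NLTK's off-the-shelf tagger as a stand-in (swap in your own tagger function if you've wired up Parsey directly):

import nltk

# One-time downloads of the tokenizer and tagger models.
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

# Receive and store the user input; this is the corpus we'll be analyzing.
text = input("Enter a command: ")

# Get an array of (word, part-of-speech tag) pairs.
tags = nltk.pos_tag(nltk.word_tokenize(text))
print(tags)  # e.g. [('open', 'VB'), ('the', 'DT'), ('door', 'NN')]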

So what is part-of-speech tagging? It's assigning grammatical labels to each word in a given corpus, all those word types we learned back in elementary school. Take the phrase "I saw her face, and now I'm a believer." If we tag each word in that phrase individually, without looking at the sentence as a whole, we might tag "saw" as a certain verb, the one that means "to cut," which would make this a quote from Leatherface. But if we look at this word in the context of the sentence, we realize that it's a different verb entirely: the past tense of "see."

Google trained Parsey by interpreting sentences from left to right. For each word in the sentence and the words around it, it extracted a set of features, like the prefix and the suffix, put them into feature vectors, concatenated them all together, and sent them to a feed-forward neural net with lots of hidden layers, which would then predict a probability distribution over the set of possible POS tags. Going in order from left to right was useful because it could use a previous word's tag as a feature for the next word.
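Here's a toy version of that idea, with made-up sizes and random weights (nothing like Google's actual architecture); it just shows concatenated feature vectors flowing through a hidden layer into a distribution over tags:

import numpy as np

rng = np.random.default_rng(0)

# Toy embedding tables for word, prefix, and suffix features (sizes made up).
EMBED = {name: rng.normal(size=(100, 16)) for name in ("word", "prefix", "suffix")}
N_TAGS = 12  # hypothetical tag-set size
W1, b1 = rng.normal(size=(3 * 16, 64)), np.zeros(64)      # one hidden layer stands in
W2, b2 = rng.normal(size=(64, N_TAGS)), np.zeros(N_TAGS)  # for "lots" of them

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def tag_probabilities(feature_ids):
    # Look up each feature's vector and concatenate them into one input.
    x = np.concatenate([EMBED[name][i] for name, i in feature_ids.items()])
    h = np.maximum(0.0, x @ W1 + b1)  # ReLU hidden layer
    return softmax(h @ W2 + b2)       # probability distribution over POS tags

print(tag_probabilities({"word": 7, "prefix": 3, "suffix": 9}).sum())  # ~1.0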

So what is a parse tree, and why isn't part-of-speech tagging enough? There's another part: the meaning behind some piece of text isn't just the type of word that's being used, but also how that word relates to the rest of the sentence. Take the example phrase "He fed her cat food." There are three possibilities for what this phrase could mean. Number one: he fed a woman's cat some food; that's the obvious one to us intuitive humans. But there's also number two: he fed a woman some food that was intended for a cat. Or number three: he somehow encouraged some cat food to eat something. The meaning of the sentence depends on the context of each word. The team used something called head-modifier construction to sort out word dependencies: it generates directed arcs between words, with labels like "direct object."
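SyntaxNet isn't the only way to see these arcs; as an illustration, spaCy (a different NLP library) exposes the same head-modifier structure, assuming you've installed it along with its small English model:

import spacy

nlp = spacy.load("en_core_web_sm")  # pre-trained English pipeline
doc = nlp("He fed her cat food.")

# Each token points at its head word via a labeled, directed arc.
for token in doc:
    print(f"{token.text:>5} --{token.dep_}--> {token.head.text}")
# e.g. an arc like 'food' --dobj--> 'fed' marks 'food' as the direct object of 'fed'.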

The sentence starts out unprocessed, with an initial stack of words; the unprocessed segment is called the buffer. As the parser encounters words moving from left to right, it pushes them onto the stack. Then it can do one of two things: it can either pop two words off the stack, attach the second to the first (which creates a dependency arc pointing to the left), and push the first word back onto the stack, or create an arc pointing to the right and push the second word back onto the stack. This keeps repeating until the entire sentence is processed. The system decides which way to point the arc depending on the context, i.e. the previous POS tagging.
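Here's a toy version of that shift-reduce loop, with a hard-coded decision policy standing in for the learned classifier (in SyntaxNet, of course, the neural net chooses the action):

def parse(words, decide):
    # decide(stack, buffer) returns "shift", "left", or "right".
    stack, buffer, arcs = [], list(words), []
    while buffer or len(stack) > 1:
        action = decide(stack, buffer)
        if action == "shift" and buffer:
            stack.append(buffer.pop(0))            # push the next word
        elif action in ("left", "right") and len(stack) >= 2:
            top, below = stack.pop(), stack.pop()  # pop two words off the stack
            if action == "left":
                arcs.append((top, below))   # arc points left: top is the head
                stack.append(top)           # push the head back onto the stack
            else:
                arcs.append((below, top))   # arc points right: below is the head
                stack.append(below)
        else:
            break  # invalid action; bail out of the toy loop
    return arcs

# A trivial policy just to watch the mechanics: shift everything, then left-arc.
print(parse(["he", "fed", "her", "cat", "food"],
            lambda stack, buf: "shift" if buf else "left"))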

Once that's done, it uses the sequence of decisions to learn a classifier that will predict dependencies in a novel corpus. It applies the softmax function to each of the decisions, which normalizes them, or adjusts them to a common scale, and it does this globally by summing up all the softmax scores in log space. So the neural net is able to get probabilities for each possible decision, and a heuristic called beam search helps decide on the best one when predicting.
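A sketch of those two ideas, global scoring in log space plus beam search, with toy numbers (the scores here are random, not real parser outputs):

import numpy as np

def log_softmax(scores):
    z = scores - scores.max()
    return z - np.log(np.exp(z).sum())

# Score a whole sequence of decisions globally by summing
# per-step log-softmax scores instead of multiplying probabilities.
def sequence_score(step_scores, decisions):
    return sum(log_softmax(s)[d] for s, d in zip(step_scores, decisions))

def beam_search(step_scores, width=2):
    # Keep only the `width` best partial decision sequences at each step.
    beams = [((), 0.0)]
    for s in step_scores:
        ls = log_softmax(s)
        candidates = [(seq + (d,), score + ls[d])
                      for seq, score in beams for d in range(len(s))]
        beams = sorted(candidates, key=lambda b: -b[1])[:width]
    return beams[0]  # the best decision sequence and its global score

rng = np.random.default_rng(1)
steps = [rng.normal(size=3) for _ in range(4)]  # 4 steps, 3 choices each
print(sequence_score(steps, [0, 2, 1, 0]))
print(beam_search(steps))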

Once we have our parse tree and part-of-speech variables, let's store the root word and the dependent object in their own variables. We'll call a synonym API to retrieve a synonym for the dependent object, then construct a novel sentence that repeats the command the user entered back to them in different wording. Looks like it works pretty well.
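The synonym service I'm calling isn't named here, so as one possible stand-in, WordNet via NLTK can play the same role (the root and dependent values below are hypothetical, as if parsed from "open the door"):

import nltk
from nltk.corpus import wordnet

nltk.download("wordnet")

def synonym_for(word):
    # Collect lemma names from every WordNet synset for this word.
    lemmas = {lemma.name().replace("_", " ")
              for synset in wordnet.synsets(word)
              for lemma in synset.lemmas()}
    lemmas.discard(word)
    return next(iter(lemmas), word)  # fall back to the original word

root, dependent = "open", "door"  # hypothetical: pulled from the parse tree
print(f"So you want me to {root} the {synonym_for(dependent)}?")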

The scope of what you can do with this is huge. You can use this text analysis to create a text summarizer, or recognize the intent of a query, or understand whether a review is positive or negative, or, my personal favorite, create a political debate fact-checker. Link with more info below. Please subscribe for more ML videos; for now, I've got to go fix a buffer overflow, so thanks for watching!