Document Classification

You’ll have to think about how to train the machine. What kind of data did you include in your training dataset, and why? What other kinds of data could have been helpful but weren’t available in the short term or for free? Your group may, in some cases, search for photograph sets. One way to get large data sets is to split YouTube videos into clips. Did your model work well for what you wanted? In what instances might your model not work very well? Include the link to your project.

What Kind of Data

We started by training our model to tell the difference between cats and dogs. We then moved on to determining whether a person was happy or sad, searching the web for images of people showing both mild and extreme happiness and sadness.

What Other Kind of Data Could Be Helpful

Shortly after we started testing the model on our own faces to see whether it had learned to tell happy from sad, we ran into a couple of confounding factors. Race, hair length, and gender can all bias the machine. To improve the experiment, we would need more photos to draw from: young people, old people, varying skin tones, varying lengths of facial and head hair, and different genders, each in both happy and sad. This would make our results more accurate. For example, when the majority of our ‘happy’ images had facial hair, we showed up as sad, which suggests the model was keying on facial hair rather than on expression.
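One way to catch this kind of skew before training is to tag each image with the attributes we care about and compare how often each attribute appears under each label. The sketch below is only an illustration: the records, the facial_hair tag, and the 0.2 threshold are made-up assumptions, not data or settings from our project.

```python
from collections import Counter

# Hypothetical hand-tagged metadata, one record per training image.
# The attribute name and the values are illustrative only.
images = [
    {"label": "happy", "facial_hair": True},
    {"label": "happy", "facial_hair": True},
    {"label": "happy", "facial_hair": True},
    {"label": "happy", "facial_hair": False},
    {"label": "sad", "facial_hair": False},
    {"label": "sad", "facial_hair": False},
    {"label": "sad", "facial_hair": False},
    {"label": "sad", "facial_hair": True},
]


def attribute_share(records, attribute):
    """Return, per label, the fraction of images where `attribute` is true."""
    totals = Counter()
    positives = Counter()
    for record in records:
        totals[record["label"]] += 1
        if record[attribute]:
            positives[record["label"]] += 1
    return {label: positives[label] / totals[label] for label in totals}


def is_skewed(shares, max_gap=0.2):
    """Flag an attribute whose share differs between labels by more than max_gap.

    The 0.2 default is an arbitrary assumption; pick a threshold that suits the data.
    """
    return max(shares.values()) - min(shares.values()) > max_gap


shares = attribute_share(images, "facial_hair")
print(shares)             # {'happy': 0.75, 'sad': 0.25}
print(is_skewed(shares))  # True
```

In this made-up set, facial hair appears in 75% of the ‘happy’ images but only 25% of the ‘sad’ ones, so a model could learn the facial hair instead of the expression. The same check could be repeated for any other tagged attribute, such as age group or hair length.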

Did Our Model Work

Yes. Eventually we added enough photos for it to work, but only minimally. It could easily be improved with a wider range of people in each category.