UXD Documentation: 留学通

Inspiration

In recent years there has been a steady rise in the number of students who want to go abroad for master's or PhD programs to further their studies. To design a product and user experience that caters to the needs of these users, over the past few weeks our group conducted market research, interviews, and questionnaires to narrow down the pain points of our potential users, so that we could decide on our core task and work out a prototype.

Market Research 

To learn about the market size, we looked into the number of Chinese international students, the market value of graduate study-abroad agencies, and the year-over-year increase in the number of Chinese students studying abroad, using data sources such as the Ministry of Education, industry white papers, and Chinabaogao. We also searched social media such as Zhihu and Douban for the anxieties and pain points of graduate applicants, and found that applicants often feel overwhelmed and anxious during the application process.

Questionnaire

Based on the results of the market research, our questionnaire asked participants about their main channels for obtaining information, their reasons for choosing to study abroad, and whether they had used relevant software or agency services. From the feedback of 76 participants, we found that most were undergraduate or graduate students, and that they care most about the quality of their study-abroad experience and about enriching their own life experience.

However, in China the channels for obtaining this information are not professional: people generally gather information from friends and Internet searches without the help of relevant practitioners, and only one third of respondents had sought advice from professionals. They would prefer a professional institution that provides more detailed information, the materials required by schools, and a self-assessment.

Interview

For the interviews, we wanted to get the widest possible range of perspectives from our potential audience. We therefore conducted six interviews, with interviewees including an undergraduate student, a university professor, and a graduate student. Each interview lasted roughly 30 minutes and contained 21 questions. After the six interviews, we used Dovetail to find trends in what users wanted as well as potential pain points. We compiled the recurring keywords into a word cloud to get a visual representation of the points our interviewees brought up most often: a lack of resources, too much time spent on searching, and a feeling of being overwhelmed and lost.

Pain points

Searching for graduate program information is too time-consuming, inefficient, and overwhelming for applicants.

Persona

Cai Xiaoqing

  • Cai is our hardworking undergraduate student persona. She is a lower-middle-class student who dreams of going to graduate school and doing biology research. However, she is concerned about the high cost of using an agency to help her find schools and put together an application. Her own efforts to find schools that suit her needs have yielded few results, especially since she is not yet sure which area of biology she wants to specialize in.

Lee Lei

  • Lee is a programmer working for an Internet company. Under the pressure of pursuing an advanced degree while also buying a house and a car, he wants a promotion and a higher income, and for him a master's degree is the best way to get there. However, constrained by his company's 996 working schedule, Lee has no spare time to research relevant programs and specific project supervisors. Moreover, he has been out of school for a long time, so he needs an app to help him complete the application and provide specific information. When Lee used our app, he quickly screened the algorithm-related programs that suited his further study and selected several universities. He also chose the consulting service, so that he could easily connect with professionals and quickly enter the application process.

Kris Wang

  • Kris is a junior student in China. He comes from a fairly wealthy family, so he doesn't really worry about his future career. However, his parents insist that he attend a graduate program. In fact, Kris's parents have been involved in almost every big choice in his life; this time, Kris wants to make the choice himself. But his academic performance is ordinary, and if he wants to get into a good graduate program and prove himself, studying abroad is probably his only option. He doesn't know much about the academic systems or application processes abroad, though, and needs to seek help.
  • Kris wants the whole application process to be easy, and he found that this app could save him a lot of work. In fact, all he needs to do is spend 15 minutes a day choosing the ones he likes among the programs the app recommends for him. Kris believes this is the start of making his own choices in life.

Core Task

Provide graduate program applicants with accurate, comprehensive, and accessible information to help them find the program best suited to them.

Paper Prototype

First Draft Prototype


User test

We got a lot of feedback including:

“Needs more information sections, such as sample courses, application deadlines, and alumni status.”

“The interface is a bit confusing. Why do the map and the program keyword search need to be separate?”

“The logo looks too big on the homepage”

Second Draft Prototype

Focused Solution

  • From the user test and class feedback, we realized it could be distracting for users if we simply added multiple functions to the app. So we decided to focus on solving one pain point, and after analysis and discussion we concentrated on the difficult search process and tried to provide help there.

Map Search

  • The map search function allows users to find a school by interacting with a map. They can tap an area on the map, and it will zoom in and display a list of schools in that area as well as a map marker for each school. Users can tap a marker to see which school it corresponds to.

Keyword Search

  • Keyword search would be the most powerful search method: we would give every program multiple tags to match against users' keywords. With these tags, searches become more accurate and users can better compare different programs. We would also provide various filters to satisfy different search needs.

Detailed Information

  • To address the pain point, we would offer very detailed information about each program. Based on our previous research, we have prepared the information applicants urgently need, including requirements, courses, and professors.

Easy Access

  • One feature of our design is easy access to information. Besides searching, you can also review a program's information at any time once you have added it to your list.

Smart push

  • To save users' research time, we would provide smart recommendations: find one good program and you have found them all. The recommendations would be based on the programs in a user's list and the user's information, so the more you use the app, the more accurate the recommendations become.

Week 14 Final Writing Assignment – “The AI Daily Prophet” – Lishan Qin

Background

When I was young, I was fascinated by the magic world created by J.K. Rowling in Harry Potter. She created so many bizarre objects in that world that I still find remarkable today. “The Daily Prophet”, a newspaper in the Harry Potter world, is the main inspiration for my final project. “The Daily Prophet” is a printed newspaper enchanted so that the images on the page appear to move. It inspired me to create an interactive newspaper with an “AI editor”, where not only do the images on the newspaper update every second according to the video captured by the webcam, but the passage on it also changes according to the image. Thus, it appears as if the newspaper is writing its reports on its own. In short, for the final project I created a “magic” newspaper with the help of an “AI editor”.

 

Motivation

I’ve always regarded the newspaper as a form of artwork. Even though fewer and fewer people today make the effort to read the news in a printed newspaper, the design of such an elegant means of spreading news and information still fascinates me, which is the first reason I wanted to develop a newspaper-related project. The second reason is that even with social media, which allows new information to spread almost every second, it still takes humans behind the screen to collect, type, and post that news. It occurred to me that if an AI editor could document, write, and edit the news on the newspaper for us, the newspaper could spread information even closer to real time. Thus, I wanted to create an interactive, self-editing newspaper in which an AI writes the news about the actions of the people it sees by generating sentences on its own. Because a newspaper that edits and writes itself seems somewhat magical, it fits well with the Harry Potter Daily Prophet theme I wanted to pursue.

Methodology

The machine learning techniques behind this project are style transfer and image captioning. To make the webcam video appear on the newspaper naturally, I trained several style transfer models on Colfax to change the style of the webcam video so that it fits the color and theme of the Daily Prophet background image I chose. At first I thought I could only train the style transfer models one at a time, but I later found that I could train several at once by creating new checkpoints and models, which saved a lot of time. Then, to make the newspaper generate words on its own according to the webcam image, I used the im2txt model from Runway that Aven recommended to me. The sample code Aven shared with me lets p5 send the canvas data to Runway; Runway then runs image captioning on that data, generates sentences as results, and sends them back to p5. Even though the model doesn't always give the most accurate results, I still find both the im2txt model and Runway amazing and helpful. The technical core of this project is the combination of the style transfer model and the im2txt model.
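To make that p5-to-Runway round trip concrete, here is a minimal sketch of the p5.js side. It assumes the im2txt model is running locally in Runway and is exposed over HTTP; the localhost port, the /query route, and the image/caption field names are assumptions for illustration, not the exact sample code mentioned above.

    // Minimal sketch (assumptions: Runway im2txt running locally at the URL
    // below, accepting {image: <base64 data URL>} and returning {caption: <string>}).
    let video;
    let caption = '...';

    function setup() {
      createCanvas(640, 480);
      video = createCapture(VIDEO);
      video.size(640, 480);
      video.hide();
      setInterval(queryRunway, 2000); // ask for a new caption every two seconds
    }

    function draw() {
      image(video, 0, 0, width, height);
      fill(255);
      text(caption, 10, height - 20); // the "report" the AI editor writes
    }

    function queryRunway() {
      // Send the current canvas as a base64 data URL to the (assumed) Runway route.
      const imageData = document.querySelector('canvas').toDataURL('image/jpeg');
      fetch('http://localhost:8000/query', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ image: imageData }),
      })
        .then((res) => res.json())
        .then((data) => { caption = data.caption; })
        .catch((err) => console.error(err));
    }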

Experiment

When I first tried to connect my style transfer model and the im2txt model, the outcome wasn't very promising at all. Even though Runway did generate new sentences and send them to p5, the words it gave weren't very relevant to the webcam image. When I looked into the reason, I found that it was because the image data Runway received was the data of the whole canvas. I tried to send only the webcam video data to Runway, but the p5 video element doesn't seem to support the .toDataURL() function. So I decided to place the newspaper background outside of the canvas and keep only the webcam image on the canvas, so that the only data Runway processed was the webcam data. This improved the image captioning to a large extent, and the words all seemed more relevant to the user's webcam image. However, the results still weren't perfect: the webcam data Runway received was the style-transferred image, and the image captioning the model can do is still somewhat limited, so it often makes mistakes when describing the image.
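Here is a minimal sketch of that layout fix, under the assumption that the newspaper background lives in the DOM while only the webcam feed is drawn on the canvas. The file name and positions are illustrative, and in the real project the style-transferred webcam image, rather than the raw video, would be drawn on the canvas.

    // The background image is a DOM element, not part of the canvas, so
    // canvas.toDataURL() only ever contains webcam pixels.
    let video;
    let backgroundImg;

    function setup() {
      backgroundImg = createImg('daily_prophet_background.png', 'newspaper background');
      backgroundImg.position(0, 0);

      const cnv = createCanvas(320, 240);
      cnv.position(180, 120); // place the canvas over the newspaper's photo slot

      video = createCapture(VIDEO);
      video.size(320, 240);
      video.hide();
    }

    function draw() {
      image(video, 0, 0, width, height); // only these pixels reach Runway
    }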

Social Impact

This project is designed to be an interactive and fun way to show people how AI technology can make “magic” come true in an entertaining way. When I presented this project at the IMA show, many people excitedly asked me about the techniques behind it after they interacted with it. As I introduced the models I used, they all showed great interest in trying the techniques themselves and in applying AI to design artistic or entertaining products. I hope this project can show people the potential of AI in various fields and encourage them to learn more about the applications and techniques of machine learning. I also think it is a nice first attempt at using AI as the self-editor of a newspaper.

Demo

https://drive.google.com/file/d/1P2aXo6lBLXIUWQU_GJb2ahmrlTw9xVG6/view?usp=sharing

Further Development

One piece of feedback I got both from the critics during the presentation and from guests at the IMA show is that I could make the output more obvious. During the IMA show, I sometimes needed to point out to people what was changing other than the image on the newspaper. There are two things I can try in order to improve this. First, I can follow Moon's idea of making the caption the headline of the newspaper and adjusting the results so they read like “WOW!!! THIS IS A WOMAN SITTING ON A CHAIR!!!”, to make the change more obvious. Second, I can try live writing of a long paragraph, where each new sentence is appended after the previous ones so that the newspaper appears to write a paragraph sentence by sentence; a rough sketch of that appending logic follows below. I think both changes would look pretty cool and would help users interact with the project better.
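A rough sketch of the live-writing idea, assuming each new caption arrives through a callback. The addSentence helper is hypothetical; in the project the captions would come from the Runway im2txt response described in the Methodology.

    let sentences = [];
    let paragraphEl;

    function setup() {
      noCanvas();
      paragraphEl = createP(''); // DOM paragraph that the "newspaper" writes into
    }

    // Call this whenever a new caption arrives; consecutive duplicates are skipped.
    function addSentence(caption) {
      if (sentences[sentences.length - 1] !== caption) {
        sentences.push(caption);
        paragraphEl.html(sentences.join(' '));
      }
    }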

MLNI-Final Project Documentation —— Lishan Qin

My Project:

My final project is called “Dr. Manhattan”. It combines different machine learning techniques to create an interactive art project. The idea is to create a post-explosion, chaotic vibe in the interface: users see themselves looking as if they are composed of nothing but hydrogen atoms, with their body parts scattered into pieces and floating in disorder. Then a screenshot is taken, and the users are asked to find a way to rebuild that screenshot image, to prove that they can take control over the chaos and find a way back to the past. The meaning behind this project is to show the fragility of human lives, the willpower of mankind to seek control in a constantly changing, chaotic world, and the futility of dwelling on the past: however similar the image the user manipulates is to the screenshot, there will always be a slight difference.

Inspiration:

My inspiration for this project is the superhero character Dr. Manhattan from the comic Watchmen. In the comic, this character is caught in a tragic physics accident in which his body is scattered into atomic pieces. He eventually manages to rebuild himself atom by atom according to basic physical laws, and his experience with the atoms gives him the power to see his past, present, and future simultaneously. The design of this character is rather philosophical, in the sense that a man who can see through the chaos of the universe and rebuild himself from atoms by following the universe's rules still can't do anything about the chaos of the world itself. This character gave me the idea of designing a project where users see themselves in a very chaotic and disordered form and get the sense that they can control the chaos and return to an ordered world, by giving them the almost impossible task of rebuilding a past moment in a constantly changing, chaotic world. I also named the project after the character.



The Process of Developing with Machine Learning Techniques

To create a post-explosion, chaotic vibe for the interface and fit the theme of the project, I first used Colfax to train several style transfer models. I tried several different images as the input and style sample to see the outputs, and eventually chose the one with the interstellar theme, because it makes the webcam image look as if everything is composed of blue dots, like hydrogen atoms. This process was relatively smooth: I uploaded the train.sh file and sample image to Colfax and downloaded the results once training finished. Each training run took about 20 hours.

However, when I tried to use BodyPix to manipulate the pixels of the style-transferred images, I ran into a lot of difficulties. First, I found that I couldn't call .loadPixels() on the output of the style transfer model. After I consulted Professor Moon, he told me that the output of the model was actually an HTML element, so I first needed to turn it into a pixel image.
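A minimal sketch of that conversion, assuming (as in ml5's styleTransfer examples) that the transfer result exposes a data URL through result.src; the model path is an illustrative assumption.

    let video;
    let style;
    let styledImg; // p5.Image whose pixels can actually be read

    function setup() {
      createCanvas(640, 480);
      video = createCapture(VIDEO);
      video.size(640, 480);
      video.hide();
      style = ml5.styleTransfer('models/interstellar', video, transferFrame);
    }

    function transferFrame() {
      style.transfer((err, result) => {
        if (!err) {
          // result.src is a data URL; loading it as a p5.Image makes loadPixels() work
          loadImage(result.src, (img) => {
            styledImg = img;
            styledImg.loadPixels();
          });
        }
        transferFrame(); // keep requesting new transferred frames
      });
    }

    function draw() {
      if (styledImg) image(styledImg, 0, 0, width, height);
    }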

Then I found that even after the transferred output became a pixel image, I still failed to manipulate it effectively with pixel iteration: the segmented parts often appeared repeatedly on the canvas. For instance, when I tried to move the pixels of the left side of the face, the canvas showed four copies of the left side of my face covering half of the canvas, which was not what I was aiming for. When I asked Professor Moon about this, he pointed out that the pixel array of the style-transferred image and the pixel array of the webcam image have different lengths, so I needed to first find the dimensions of each image and map the indices accordingly in order to manipulate the pixels correctly.
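A minimal sketch of that index mapping, assuming the segmentation is computed on the webcam image while the pixels being moved belong to the style-transferred image; the helper name and sizes are illustrative.

    // Map an (x, y) position on the source grid to a flat RGBA pixel index on a
    // destination image of different dimensions (4 array entries per pixel).
    function mapPixelIndex(x, y, srcW, srcH, dstW, dstH) {
      const dx = constrain(floor(map(x, 0, srcW, 0, dstW)), 0, dstW - 1);
      const dy = constrain(floor(map(y, 0, srcH, 0, dstH)), 0, dstH - 1);
      return 4 * (dy * dstW + dx);
    }

    // Usage: for a webcam pixel (x, y) flagged by BodyPix as a given body part,
    // the corresponding entry in styledImg.pixels starts at
    //   mapPixelIndex(x, y, webcamW, webcamH, styledImg.width, styledImg.height)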

However, when I managed to move the pixels of the segmented body parts away from the body and make them float around the canvas, the original parts still appeared on the body, even when I tried to cover them with different pixels. Professor Moon later told me to create a separate value, “displayImg”, to store the manipulated pixels of “img”, and to display that value instead.
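A minimal sketch of that fix: read from the source image but write every change into a separate buffer and draw only the buffer, so the half-modified source never reaches the screen. The in-place copy loop is a placeholder standing in for the real scattering logic.

    let img;        // source frame (in the project, the style-transferred webcam image)
    let displayImg; // separate buffer that actually gets drawn

    function setup() {
      createCanvas(640, 480);
      img = createImage(width, height);        // placeholder source for this sketch
      displayImg = createImage(width, height);
    }

    function draw() {
      img.loadPixels();
      displayImg.loadPixels();
      for (let i = 0; i < img.pixels.length; i += 4) {
        // ...decide where each source pixel should land (scattered or in place)...
        for (let c = 0; c < 4; c++) displayImg.pixels[i + c] = img.pixels[i + c];
      }
      displayImg.updatePixels();
      image(displayImg, 0, 0); // draw the buffer, never the partially modified source
    }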

After I used BodyPix to segment the body parts, the image on the canvas showed the person's face cut in half and appearing in various places, with the arms, shoulders, and hands all torn away from the body and showing up in different places as well. Then I used KNN to compare the current image with the screenshot image, so that the user can see how well they are rebuilding the screenshot. During the presentation, I used the save() function, which takes a screenshot of the current frame and downloads it; Moon later showed me that taking a screenshot and reusing it can be done simply with get(), as sketched below. In addition, I added background music to make the project feel more connected to the chaos-and-scattering theme.
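A minimal sketch of the screenshot step with get(), plus a deliberately simplified pixel-difference score that stands in for the project's KNN comparison (it is not the actual method used, just an illustration of comparing the live frame against the snapshot).

    let snapshot;

    function setup() {
      createCanvas(640, 480);
    }

    function draw() {
      background(20);
      // ...draw the scattered, style-transferred body parts here...
    }

    function keyPressed() {
      if (key === 's') snapshot = get(); // capture the canvas as a p5.Image, no download
    }

    // Very rough similarity measure: 1 means identical, 0 means completely different.
    function similarityToSnapshot() {
      if (!snapshot) return 0;
      const current = get();
      current.loadPixels();
      snapshot.loadPixels();
      let diff = 0;
      for (let i = 0; i < current.pixels.length; i += 4) {
        diff += abs(current.pixels[i] - snapshot.pixels[i]); // red channel only, for brevity
      }
      return 1 - diff / (255 * (current.pixels.length / 4));
    }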

Project Demo

https://drive.google.com/file/d/1yX9cjRdzE9uTAuK2sBTgmlVMSkhRXnAp/view?usp=sharing

Deficiencies & Future Development

As the critics pointed out during the presentation, more ways of manipulating the style-transferred body pixels remain to be explored. Currently, to keep the scattered images recognizable as body parts, I move the whole arm, from elbow to hand, away from the body as one piece. However, manipulating the pixels this way makes the scattering effect less obvious. In the future, I could separate the arm and hand as well and handle those segments better, so that they remain recognizable as body parts while looking more artistic. In addition, as Tristan pointed out, the way I used KNN in my project didn't really contribute to the idea I wanted to convey: illustrating the struggle against chaos and the dwelling on the past by asking users to imitate a screenshot feels a little indirect. I should instead try tracking the users' actual body-part coordinates and let them interact with the images on the canvas to trigger more varied output, such as rebuilding part of the body or an even more drastic scattering.

Week 12 Assignment: Document Final Concept —— Lishan Qin

Background

When I was young, I was fascinated by the magic world created by J.K. Rowling in Harry Potter. She created so many bizarre objects in that world that I still find remarkable today. “The Daily Prophet”, a newspaper in the Harry Potter world, is the main inspiration for my final project. “The Daily Prophet” is a printed newspaper enchanted so that the images on the page appear to move. It inspired me to create an interactive newspaper with an “AI editor”, where not only do the images on the newspaper update every second according to the video captured by the webcam, but the passage on it also changes according to the image. In my final project, I will use style transfer to make the user's face appear on the newspaper and utilize im2txt to change the words of the passages on the newspaper according to what the user is doing. I will build an interactive newspaper that constantly reports the user's actions.


Motivation

Even with the development of social media, which allows new information to spread almost every second, it still takes humans behind the screen to collect, type, and post that news. However, if an AI editor could document, write, and edit the news on the newspaper for us, the newspaper could spread information even closer to real time. Thus, I want to create an interactive, self-editing newspaper in which an AI writes the news about the actions of the people it sees by generating sentences on its own.

Reference

I'll refer to the im2txt example on GitHub (https://github.com/runwayml/p5js/tree/master/im2txt) to create the video captions. The model generates sentences describing the objects and actions captured by the webcam video. I will run the model in Runway, which will send the resulting caption back to the HTML page so that I can manipulate the output. Since some of the captions aren't very accurate, I still need to find ways to improve on that.
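A minimal sketch of that last manipulation step, assuming the caption string has already arrived from Runway and that the page contains an element to write into; the element id and helper name are illustrative.

    // Write the received caption into an existing element on the newspaper page.
    function showCaption(captionText) {
      const headline = select('#headline');        // p5 DOM lookup (assumed id)
      if (headline) {
        headline.html(captionText.toUpperCase());  // rewrite the newspaper headline
      }
    }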

Week 11 Assignment: Explore BigGAN or deepDream —— (Lishan Qin)

For this week's assignment, I played with DeepDream a little to generate images. I ran the model on the image below and changed the settings to see how each one influences the output. The difficulty I met was that when I first ran the deep_dream.py file, the error message “RuntimeError: Variable += value not supported.” showed up. I changed the line to “loss = loss + coeff * K.sum(K.square(x[:, :, 2: -2, 2: -2])) / scaling” and it worked. I still don't fully understand why the original “loss += …” failed; it is likely because loss starts out as a Keras backend variable, and TensorFlow does not support the += operator on variables, whereas “loss = loss + …” simply builds a new tensor. In any case, the program began to generate output after the change. The results are as follows.

The input image:

The output image when no setting is changed:

The output when I changed all the “feature settings” to 2:

(The changes seem to be smaller than the original image.)

The output when I changed the step from 0.01 to 0.1 and left everything else unchanged:

Overall this was a super fun experience. The images it generated have a really powerful and interesting feeling. Even though I haven't yet figured out how I can apply this technology in my final project, I still find DeepDream a very powerful technique.