- Working at Tencent AI
- Successful OpenSource projects he worked on
Tencent != 10¢
- From the Internet to WeChat
- Basically Google for China + they managed to be good at social media and messaging
- They have a lot of data and they can probably use it, because there is no GDPR in China
Tencent AI
- 70 AI scientists + 300 app developers (== ML Engineers)
- #1 in AI publications in China, yet still only 1/10 of Google's paper count
GNES ‒ Generic Neural Elastic Search
- One framework to train text, image and video search
- Meant to be used like TensorFlow (TF), except that every component is deployed in a different place
- A bit like Kubeflow/Airflow, but optimized for the search scenario
Premises
- Cloud native
- Semantic search (using DNN)
- End2end
Obstacles
- What is the distance metric for doc vectors?
- How to handle the difference between short and long texts (and likewise videos, images)?
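
A minimal sketch of why the short vs. long question is hard, assuming cosine similarity as the distance metric and mean pooling as the way to get one vector per document (neither choice is confirmed by the talk). The 3-dimensional embeddings are invented for illustration; a real system would get them from a neural encoder.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def mean_pool(vectors):
    """Average a list of equal-length vectors into a single document vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical embeddings, invented for this example.
query = [1.0, 0.0, 0.0]
short_doc = [[1.0, 0.0, 0.0]]          # one on-topic sentence
long_doc = [[1.0, 0.0, 0.0],           # the same on-topic sentence ...
            [0.0, 1.0, 0.0],           # ... plus two off-topic sentences
            [0.0, 0.0, 1.0]]

short_score = cosine_similarity(query, mean_pool(short_doc))
long_score = cosine_similarity(query, mean_pool(long_doc))
print(short_score, long_score)
```

Both documents contain the sentence that matches the query exactly, yet the long one scores lower because the off-topic sentences dilute its pooled vector.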
The Idea
- Minimum information unit × minimum semantic unit × optimum semantic unit
- a word × a sentence × ???
- a pixel × a 64×64 patch × ???
- a pixel in 1 frame × a 2-3 sec shot × ???
- What are the optima?
- Do an experiment!
- => determined by ONE preprocessor
- Models change too quickly
- => plug them in as a Docker container
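
The idea of letting one swappable preprocessor decide the semantic unit can be sketched as follows. This is an illustration under assumptions, not GNES code: `sentence_preprocessor`, `whole_doc_preprocessor`, `toy_encode` and `score_document` are hypothetical names, the bag-of-words encoder stands in for a neural model, and taking the best-matching chunk as the document score is one possible aggregation.

```python
import math
import re
from collections import Counter

def sentence_preprocessor(text):
    """Split a document into sentences (one candidate 'semantic unit')."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def whole_doc_preprocessor(text):
    """Treat the whole document as a single unit (the baseline to compare against)."""
    stripped = text.strip()
    return [stripped] if stripped else []

def toy_encode(chunk):
    """Stand-in for a neural encoder: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", chunk.lower()))

def cosine(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def score_document(query, text, preprocessor):
    """Score a document by its best-matching chunk; the preprocessor picks the chunking."""
    query_vec = toy_encode(query)
    scores = (cosine(query_vec, toy_encode(chunk)) for chunk in preprocessor(text))
    return max(scores, default=0.0)

doc = ("GNES is a search framework. It runs in the cloud. "
       "Every component is a microservice.")
sentence_score = score_document("search framework", doc, sentence_preprocessor)
whole_score = score_document("search framework", doc, whole_doc_preprocessor)
print(sentence_score, whole_score)
```

Swapping the preprocessor is the whole experiment: the encoder and the scoring stay untouched, and only the chunking changes between runs.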
Specs
- Everything is its own microservice
- All of the components are defined by a YAML file => immutable code, just change the YAML
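
A rough sketch of the "immutable code, just change the YAML" principle. The config keys (`components`, `name`, `params`) and the component names are made up for illustration and are not the actual GNES schema. The dictionaries stand in for what a YAML parser would return, and plain in-process functions stand in for what would be separate microservices.

```python
def make_splitter(sep=". "):
    """Component factory: split every chunk on a separator."""
    def run(chunks):
        return [piece for chunk in chunks for piece in chunk.split(sep) if piece]
    return run

def make_lowercaser():
    """Component factory: lowercase every chunk."""
    def run(chunks):
        return [chunk.lower() for chunk in chunks]
    return run

# Fixed registry of available components; this code never changes.
REGISTRY = {"splitter": make_splitter, "lowercaser": make_lowercaser}

def build_pipeline(config):
    """Assemble a pipeline from a parsed config (hypothetical schema)."""
    steps = [
        REGISTRY[spec["name"]](**spec.get("params", {}))
        for spec in config["components"]
    ]
    def pipeline(text):
        chunks = [text]
        for step in steps:
            chunks = step(chunks)
        return chunks
    return pipeline

# Two configs, as they might look after parsing two different YAML files.
config_a = {"components": [{"name": "lowercaser"}]}
config_b = {"components": [
    {"name": "splitter", "params": {"sep": ". "}},
    {"name": "lowercaser"},
]}

print(build_pipeline(config_a)("Hello World. Bye"))
print(build_pipeline(config_b)("Hello World. Bye"))
```

The behaviour differs between the two runs although no function was edited; only the configuration changed.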
Questions
- Are you happy with the “All as microservice” design?
- Do you have any idea what the overhead is?
- It adds overhead if the processing of each component is small
- => great for images or, even better, video (just FFmpeg)
- Is the presentation somewhere?
- Can it be put into lambda functions?
- Not for now
- Not because of the loading
- How to address multiple languages – when preprocessing is completely different?
- It is your job to write a specific YAML file that preprocesses it
- How to handle huge models in Docker images?
- Just put everything into the container
- What is the view in China on big OpenSource contributions?
- They do not get much support for OpenSource (a lot of paperwork, deciding what to publish etc.)
- 30 % of the work is relationship building
- Challenges for Europe with regards to AI
- A lot of Chinese AI is focused on the consumer (you can test something on this million of data and that million of data)
- => not much 0 to 1 research, a lot of 1 to N research
- Don’t try to copy B2C of US and China
- Try to focus on the middle man: companies that can change the world a lot but are not “shiny” (no coverage)
- Upgrade the traditional industry! Keep Europe great still!