
Transfer Learning - Use Inception V3 to Solve Any ML Problem (Tempo Detection)

62 ratings | 2675 views
This tutorial shows how to use Google's Inception v3 model to solve machine learning problems in domains beyond image classification. Specifically, we tackle a machine learning problem centered on audio (STEM) files. Instead of feeding in the audio clips as normalized arrays of integers, we pass plots of 10-second clips to the Inception model, perform transfer learning on its final layer, and produce a very accurate and reliable predictor of an audio file's tempo.
Google Colab Notebook: https://colab.research.google.com/drive/1-gXcvRzEqu3E5lWVLYaet7EP6A_bFzmD
#tensorflow #machinelearning
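The preprocessing step described above (turning 10-second clips into plot images) might look roughly like the sketch below. It is not the notebook's code: the function names `split_into_clips` and `waveform_to_image`, the NumPy-only rasterization (in place of a plotting library), and the white-on-black rendering are all assumptions made for illustration. The 299x299 size matches Inception v3's default input resolution.

```python
import numpy as np

INCEPTION_SIZE = 299  # Inception v3's default input is 299x299 RGB


def split_into_clips(samples, sample_rate, clip_seconds=10):
    """Cut a 1-D signal into non-overlapping clips, dropping any short tail."""
    samples = np.asarray(samples)
    clip_len = sample_rate * clip_seconds
    n_clips = len(samples) // clip_len
    return samples[: n_clips * clip_len].reshape(n_clips, clip_len)


def waveform_to_image(clip, size=INCEPTION_SIZE):
    """Rasterize one clip as a white-on-black waveform plot (size x size x 3).

    Hypothetical stand-in for a real plotting call; assumes the clip has
    at least `size` samples so every pixel column gets some data.
    """
    clip = np.asarray(clip, dtype=np.float64)
    # Reduce the clip to one (min, max) envelope pair per pixel column.
    columns = np.array_split(clip, size)
    lows = np.array([c.min() for c in columns])
    highs = np.array([c.max() for c in columns])
    # Map amplitudes in [-peak, peak] to pixel rows; row 0 is the top.
    peak = max(np.abs(clip).max(), 1e-12)
    top = np.round((1 - highs / peak) / 2 * (size - 1)).astype(int)
    bottom = np.round((1 - lows / peak) / 2 * (size - 1)).astype(int)
    image = np.zeros((size, size, 3), dtype=np.uint8)
    for x in range(size):
        # Draw a vertical bar covering this column's amplitude range.
        image[top[x] : bottom[x] + 1, x, :] = 255
    return image


if __name__ == "__main__":
    # Synthetic stand-in for real audio: 25 seconds of a 440 Hz tone.
    sample_rate = 8000
    t = np.arange(sample_rate * 25)
    signal = np.sin(2 * np.pi * 440 * t / sample_rate)
    clips = split_into_clips(signal, sample_rate)
    images = np.stack([waveform_to_image(c) for c in clips])
    print(images.shape)
```

Images produced this way could then be grouped into one folder per tempo class and used to retrain only the final layer of Inception v3, as the description outlines; how the notebook actually labels and feeds the images is not shown here.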
Text Comments (17)
Luis Cunha (15 days ago)
So fun!
Macgyver (14 days ago)
Jasper Nuyens (1 month ago)
Inspirational, thank you so much! We are including this into our AI course in Belgium. Always welcome for a beer :-D http://www.linuxbe.com/neuralnetworks.html
Macgyver (1 month ago)
Very cool, thank you!
Ramy Hussein (1 month ago)
Very nice idea. Could you please share the code?
Macgyver (1 month ago)
No problem!
Ramy Hussein (1 month ago)
+Macgyver Thanks a million! Really appreciate it!
Macgyver (1 month ago)
https://colab.research.google.com/drive/1-gXcvRzEqu3E5lWVLYaet7EP6A_bFzmD -- I really need to clean it up a bit.
Yinghao Hu (2 months ago)
I had once thought about this, but you actually implemented it. That's great. Is it possible to apply it more broadly, to voice recognition?
Yinghao Hu (1 month ago)
https://github.com/Erickrus/voice_vgg/ You can find some sample data there. I use an FFT to turn every 2000 positions in the wav file into an image. The layout is an accumulation of the positive/real part of the transformed array. The image shows some differences between sound units, and it stays almost the same when a long sound is pronounced.
Yinghao Hu (1 month ago)
If I have time, I will try this with an FFT preprocessor to see whether it works.
Macgyver (2 months ago)
I’m not sure; I haven’t dived too deep into voice recognition. My guess would be no. Tempo is relatively easy because the model can spot the frequency of beats and build a classifier based on that. Voice has much more nuance, I believe.
Steve Fox (5 months ago)
Nice - a novel approach.
Macgyver (4 months ago)
Falguni Das (7 months ago)
Great!
ronzohar (9 months ago)
Very nice; however, your validation method seems lacking. It would be much better to use a different clip for validation. Also, you did not specify the results on the validation/test set.
Macgyver (9 months ago)
Agreed, there is a lot of room for improvement. But it was the same validation method using Keras directly, and again we could not do better than 35%, so it still proves a good approach.
