Nerd sniping myself into a rabbit hole... Streaming online audio to a Sonos speaker

After buying a set of Sonos-compatible speakers at IKEA, I was disappointed there's no support for playing audio from a popular video streaming service. They stream Internet radio, podcasts and what not. Well, not that service I want it to play!
Determined - and not knowing how deep the rabbit hole would be - I ventured on a trip that included network sniffing on my access point, learning about UPnP and running a web server on my phone (without knowing how to write anything Android), learning how MP4 audio is packaged (and has to be re-packaged). This ultimately resulted in an Android app for personal use, which does what I initially wanted: play audio from that popular video streaming service on Sonos.
Join me for this story about an adventure that has no practical use, probably violates Terms of Service, but was fun to build!



  1. 1. Nerd sniping myself into a rabbit hole... Streaming online audio to a Sonos speaker Maarten Balliauw @maartenballiauw
  2. 2. Disclaimer I will share bits of source code where they matter, but will not be sharing the full application. I have built this application for personal and learning use, and I do not intend to share it. Don’t ask, the answer is no.
  3. 3. Living room speakers
  4. 4. January 2020 “Let’s replace our old speakers with new and shiny!” Requirements: “Smart” speakers that can stream from the Internet 2 for living room, 1-2 for home office
  5. 5. Now what… Searched online for solutions… …all I found was excuses. Legal, patents, … Don’t really care as a consumer!
  6. 6. https://support.sonos.com/s/article/79?language=en_US
  7. 7. Nerd Sniping 1. The act of presenting someone, often a mathematician/physicist with a time consuming problem or challenge (often impossible to solve or complete) in the hopes of it appealing to a person's obsessive tendencies. Urban Dictionary And https://xkcd.com/356
  8. 8. Research
  9. 9. What would I need? 1. Connect to speakers 2. Get MP4 URL of online video 3. Send URL to speakers 4. Enjoy music!
  10. 10. https://python-soco.com/
  11. 11. Connect to speakers Connect to one speaker Play an MP3 from webserver This seems very, very promising! #!/usr/bin/env python from soco import SoCo if __name__ == '__main__': sonos = SoCo('192.168.1.102') sonos.play_uri('http://host/file.mp3') https://python-soco.com/
  12. 12. Get MP4 URL of online video Video metadata endpoint, used by web player Returns urlencoded data about video (with JSON sprinkled in) https://www.youtube.com/watch?v=-zJoP2qPgTg  https://www.youtube.com/get_video_info?video_id=-zJoP2qPgTg
  13. 13. Get MP4 URL of online video { "responseContext":{ }, "playabilityStatus":{ }, "streamingData":{ "expiresInSeconds":"21540", "formats":[ { "itag":18, "url":"https://r3---sn-uxaxoxu-cg0k.googlevideo.com/videoplayback?expire=1599574779\u0026ei=mz5XX6-fKdOIgQeBp6PgDw\u0026 "mimeType":"video/mp4;+codecs=\"avc1.42001E,+mp4a.40.2\"", "bitrate":234221, "width":640, "height":360, "lastModified":"1586173729658545", "contentLength":"92057863", "quality":"medium",
  14. 14. Get MP4 URL of online video More reading by Alexey Golub https://tyrrrz.me/blog/reverse-engineering-youtube Signed videos Video player JS code contains decryption routine as JavaScript Need to evaluate that to be able to access video (or Regex the cipher) Too much hassle to write manually! There exist scripts & libraries in many programming languages In summary: we have the MP4 URL now.
  15. 15. Send URL to speakers #!/usr/bin/env python from soco import SoCo if __name__ == '__main__': sonos = SoCo('192.168.1.102') sonos.play_uri('http://host/file.mp4') https://python-soco.com/ Expected: Actual:
  16. 16. Side track: speaker webserver Anything useful to find? http://192.168.1.123:1400/status http://192.168.1.123:1400/support/review http://192.168.1.123:1400/tools.htm https://bsteiner.info/articles/hidden-sonos-interface
  17. 17. Troubleshooting
  18. 18. 💡 Maybe it’s that SoCo library! “Because maybe 65 contributors have it wrong!” The official application can send a stream to the speakers... ...can I listen on the network and see what the request looks like?
  19. 19. Sniffing the network Wireshark https://www.wireshark.org/ Sniff traffic that passes your computer’s network adapter Traffic does not pass my computer :-/ Phone on wifi, speaker on wifi, computer on wifi – huh? Turns out the access point does not just send all traffic to all devices 💡💡 Unifi access point is *nix: is tcpdump there?
  20. 20. Sniffing the network 🤓 On my Windows box, connected to wired network Run Ubuntu SSH into access point and run tcpdump Pipe data back to Windows Access point IP Capture ethernet side From/to my phone IP
  21. 21. Your fancy wifi speaker uses SOAP
  22. 22. SOAP payload <?xml version="1.0" encoding="UTF-8"?> <s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" s:encodingStyle="http://schema <s:Body> <u:SetAVTransportURI xmlns:u="urn:schemas-upnp-org:service:AVTransport:1"> <InstanceID>0</InstanceID> <CurrentURI>x-rincon-mp3radio://host/media.mp3</CurrentURI> <CurrentURIMetaData> <DIDL-Lite xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/" xmlns:dc="http <item id="R:0/0/6" parentID="R:0/0" restricted="true"> <dc:title>Title here</dc:title> <upnp:class>object.item.audioItem.audioBroadcast</upnp:class> <upnp:albumArtURI>https://host/art.jpg</upnp:albumArtURI> <r:description>Description here</r:description> <desc id="cdudn" nameSpace="urn:schemas-rinconnetworks-com:metadata-1- </item> </DIDL-Lite> </CurrentURIMetaData> </u:SetAVTransportURI> </s:Body> </s:Envelope>
  23. 23. Replaying SOAP payload Tried MP3 URLs and MP4 URLs MP3 worked, MP4 did not The SoCo library did not have any issues... Searched around for DIDL-Lite in payload Seems speakers use good old UPnP http://www.upnp.org/schemas/av/didl-lite-v2.xsd
  24. 24. 💡 Maybe it’s the MP4 format! https://support.sonos.com/s/article/79?language=en_US
  25. 25. Our (potential) options... Download MP4, push it to speaker as a local file or Proxy MP4 and do on-the-fly transcoding to MP3 Send MP3 URL as “Internet Radio” or Investigate MP4 and see if they indeed use AAC Send AAC URL as “Internet Radio”
  26. 26. Containers
  27. 27. MP4 MPEG-4 Part 14 or MP4 is a digital multimedia container format most commonly used to store video and audio, but it can also be used to store other data such as subtitles and still images. (…) allows streaming (…) Wikipedia Can we extract this to a separate file? MP4 file Header Video 1 Video N Audio 1 Audio N Subtitles MP4 file (optimized for streaming) Header Video 1 (short) Audio 1 (short) Video N (short) Audio N (short)
  28. 28. FFMpeg to the rescue! “A complete, cross-platform solution to record, convert and stream audio and video.” Swiss army knife for video/audio formats. ffmpeg -i original.mp4 -c:a copy output-aac.m4a Extracts the audio track from MP4 container Use SoCo to send file to speakers. https://ffmpeg.org/ Expected: Actual:
  29. 29. deadf00d “How I hacked Sonos and YouTube in the same day.” https://www.deadf00d.com/post/how-i-hacked-sonos-and-youtube-the-same-day.html @deadf0od - https://twitter.com/deadf0od “HEY, kAn 1 Dm J00? w0rK1n' 0N 51M1Lar 7h1n' AnD wE M19h7 8E a8le 70 HElP EaCH 07heR.” ... “It’s AAC, but in ADTS format. Each atom needs a header in every frame!”
  30. 30. MP4 to AAC with ADTS ffmpeg -i original.mp4 -acodec copy -f adts -vn output-adts.aac Extracts the audio track from MP4 container Adds ADTS headers Use SoCo to send file to speakers. Expected: Actual:
  31. 31. Making it an app
  32. 32. Let the app building start! 1. Connect to speakers ✅ 2. Get MP4 URL of online video ✅ 3. (new) Extract MP4 audio track to ADTS ✅ 4. Send URL to speakers ✅ 5. Enjoy music!
  33. 33. Architecture YouTube app Share https://youtube.com/watch? v=-zJoP2qPgTg YouTube.com Sonos speaker Main MP4 to ADTS MP4 audio Reverse proxy https://<ip>/audio.adts
  34. 34. Technology choices Native app? Xamarin? Flutter? Jetpack Compose? 🤓♂️ Installed Android Studio as it’s similar to JetBrains Rider, IntelliJ IDEA, ...
  35. 35. Getting started… Template Start with empty and add things? Start with <any> and remove things? Language Java? Kotlin?
  36. 36. Research… Android-specific Activity (main screen to handle everything) Intent (ACTION_SEND to receive data from others) Libraries (or code) YouTube metadata extractor (get audio URL) Sonos communication library (discover speakers, send URL to speakers) Webserver (reverse proxy) Something to do the MP4 to AAC (ADTS) conversion
  37. 37. It all starts with a manifest! <?xml version="1.0" encoding="utf-8"?> <manifest xmlns:android="http://schemas.android.com/apk/res/android" package="be.maartenballiauw.android.sonostube"> <application android:allowBackup="true" android:icon="@mipmap/ic_launcher" android:label="@string/app_name" android:roundIcon="@mipmap/ic_launcher_round" android:usesCleartextTraffic="true" android:theme="@style/AppTheme"> <activity android:name=".MainActivity"> <intent-filter> <action android:name="android.intent.action.MAIN" /> <category android:name="android.intent.category.LAUNCHER" /> </intent-filter> <intent-filter> <action android:name="android.intent.action.SEND" /> <category android:name="android.intent.category.DEFAULT" /> <data android:scheme="https" android:host="youtu.be" android:mimeType="text/*"/> <data android:scheme="https" android:host="youtube.com" android:mimeType="text/*"/> <data android:scheme="https" android:host="www.youtube.com" android:mimeType="text/*"/> </intent-filter> </activity> </application> </manifest>
  38. 38. And checking the intent… class MainActivity : CoroutineScope, AppCompatActivity() { override fun onCreate(savedInstanceState: Bundle?) { super.onCreate(savedInstanceState) setContentView(R.layout.activity_main) val url = intent?.extras?.getString(Intent.EXTRA_TEXT)
  39. 39. SSDP Simple Service Discovery Protocol Used to discover printers, routers, Sonos, ... Send UDP datagram as multicast/broadcast M-SEARCH * HTTP/1.1 HOST: 239.255.255.250:1900 MAN: "ssdp:discover" MX: 1 ST: urn:schemas-upnp-org:device:ZonePlayer:1 Devices send back HTTP response as UDP Supported services, endpoint URL, ... https://en.wikipedia.org/wiki/Simple_Service_Discovery_Protocol https://github.com/vmichalak/ssdp-client
  40. 40. Android – Running webserver Prevent server stop when application is closed or device goes to sleep <service android:name=".RunEmbeddedWebServerService" android:enabled="true" android:exported="true" /> class RunEmbeddedWebServerService : CoroutineScope, Service() { private val server = embeddedServer(Netty, 36362) { routing { get("/{videoId}.mp4") { /* ... */ } } } override fun onStartCommand(...): Int { server.start(wait = false) return START_NOT_STICKY } override fun onDestroy() { server.stop(0, 1000) super.onDestroy() } https://developer.android.com/guide/components/services#Types-of-services
  41. 41. MP4 to ADTS – FFMpeg?
  42. 42. MP4, ADTS, Atoms, Boxes https://www.deadf00d.com/post/how-i-hacked-sonos-and-youtube-the-same-day.html
  43. 43. MP4, ADTS, Atoms, Boxes MP4 with AAC Audio (simplified) ADTS with AAC Audio (simplified) moof “Hey, 25 samples coming. 128kbps, 2 channel audio!” mdat Raw, binary data for 25 samples. moof “Hey, 12 samples coming. 128kbps, 2 channel audio!” mdat Raw, binary data for 12 samples. Header “40 bytes coming, 2 ch” + 40 bytes for 1 sample Header “36 bytes coming, 2 ch” + 36 bytes for 1 sample Header “42 bytes coming, 2 ch” + 42 bytes for 1 sample Header “40 bytes coming, 2 ch” + 40 bytes for 1 sample
  44. 44. MP4, ADTS, Atoms, Boxes Had to manually write conversion logic... Open MP4 stream (https://github.com/sannies/mp4parser) Foreach moof, read metadata Foreach sample, write ADTS header, write sample
  45. 45. Exploring the code demo
  46. 46. In summary…
  47. 47. In summary… Enjoy music with an app ✅ Connect to speakers ✅ Get MP4 URL of online video ✅ Extract MP4 audio track to ADTS ✅ Send URL to speakers ✅
  48. 48. “Do you use it often?”
  49. 49. In summary… Learned a lot of random things along the way ✅ There is so much knowledge out there! Talk to people (thanks, deadF00d!) You can build anything! Even if it seems impossible at first!
  50. 50. Thank you! https://blog.maartenballiauw.be @maartenballiauw
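
The MP4-to-ADTS step from the slides (for each sample, write an ADTS header, then the sample) can be sketched in Python. This is only a minimal sketch under stated assumptions, not the app's actual Kotlin converter: the names `adts_header` and `to_adts` are hypothetical, the audio is assumed to be AAC-LC without CRC, and the raw AAC samples are assumed to have been read from the MP4 `mdat` boxes already.

```python
# Sampling-frequency index table used by the ADTS header.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]

AAC_LC = 2  # MPEG-4 audio object type for AAC Low Complexity (assumption)


def adts_header(payload_len, sample_rate=44100, channels=2, object_type=AAC_LC):
    """Build the 7-byte ADTS header (no CRC) for one raw AAC sample."""
    freq_idx = SAMPLE_RATES.index(sample_rate)
    frame_len = 7 + payload_len  # the length field counts the header too
    if frame_len > 0x1FFF:
        raise ValueError("frame does not fit the 13-bit ADTS length field")
    return bytes([
        0xFF,                                   # syncword, high 8 bits
        0xF1,                                   # syncword low bits, MPEG-4, layer 0, no CRC
        ((object_type - 1) << 6) | (freq_idx << 2) | (channels >> 2),
        ((channels & 0x3) << 6) | (frame_len >> 11),
        (frame_len >> 3) & 0xFF,
        ((frame_len & 0x7) << 5) | 0x1F,        # low length bits + buffer fullness (high bits)
        0xFC,                                   # buffer fullness (low bits), one raw data block
    ])


def to_adts(samples, **kwargs):
    """Prefix every raw AAC sample with its own ADTS header and concatenate."""
    return b"".join(adts_header(len(s), **kwargs) + s for s in samples)
```

Because every frame carries its own header, the output can be written out sample by sample while the MP4 is still being downloaded, which is what makes the reverse-proxy approach possible without buffering the whole file first.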

Editor's Notes

  • https://pixabay.com
  • Wife and I built house, set of old PC speakers. Time for a replacement!
    Mostly play streaming music, Spotify, TuneIn, SoundCloud, YouTube, ...
    Requirements: smart, so that works
    2 for living room, 1 or 2 for home office, and multiroom would be even better!
    Unfortunately, it being January, everything out there was full-price. Not that we are cheap, but in terms of value proposition we do find 2k for some speakers a bit skewed.
  • But then, we passed an IKEA store, and saw... Symfonisk.
    They are a collaboration with Sonos. Essentially, Sonos speakers, rumour has it that it’s the same hardware as that Sonos from earlier, at 1/5 the price.
    Same software, 100% compatible with that Sonos stuff.
    And as you can see on the picture, I can even put my glasses on top of it.
    Listened to it in the store, and seemed fine. DEAL! We walked out with those speakers.
  • So we took them home (click)
    Read the manual (click)
    Followed all instructions, so we took a seat (click)
    And installed, the app!
  • It was brilliant. We were able to tune a local radio station on TuneIn, push a playlist from Spotify to those speakers, and listen downstairs and upstairs.
    We were almost in love with this new setup. SO GOOD.
  • There’s this video streaming site that lets me cast videos to my TV, and very often there is good music to be found as well.
    This is where the honeymoon phase with our new speakers ended.
    There is no support to cast video, or at least the audio of a video, to our smart speakers. NOT VERY SMART!
  • Now what? And yes this slide looks bad. But this was our feeling!
    Searched online, but all I could find were excuses. These two companies are fighting over patents, and generally not playing nice together.
    I don’t care as a consumer!

    But no way around it, this is what it was going to be.
  • In one of my searches, I did find an app which seemed to sort of do what I was after.
    Except, yet another party to give permissions to. Why can’t I just use the “Share” button in that other app?
  • But then I found this page on the Sonos website.
    Supported audio formats: MP4.
    And that streaming website is MP4.
    How hard can it be to push the video/audio URL to the device and be done with it?
  • Mention colleague usually snipes, although I have become proficient at sniping myself. Which is not ideal.
  • Putting those steps in an app should be easy once the individual steps work.
  • So a few Google searches later, I found this Python library, SoCo.
    It lets you write Python, or use the command line, to do things like discover speakers, change volume, play music, ...
    This seemed great! So I went off and installed a Python environment.
  • That did not work...
    Actual is wrong, it did say “pop” about a second after a send.
  • HUH? Now what?
  • Check some Sonos web UI, do some pinging/traceroute/...
  • Irony: finding a license free picture of a rabbit hole was quite the rabbit hole in itself.
  • Narrator: They don’t have it wrong.
  • ssh admin@192.168.1.216 "tcpdump -s0 -U -n -w - -i eth0 host 192.168.1.195" > /mnt/c/Users/maart/Desktop/capture
  • Find from/to that matches
  • Follow TCP stream, and see payload!
  • Nice! Pass URL, title, description and album art!
  • The SoCo library did not have any issues, it seems. Who would have thought.
    UPnP! Not sure what that knowledge brings, but does mean there might be more documentation out there.
  • MP4 does not support streaming... Only local library.
    Should I sniff payloads to see what is sent if I play MP4 from my phone library?
    Seems like a good fallback in case needed, but ideally don’t want to download the full video first, then upload to the speaker. I want close to instant!

    Maybe I could transcode MP4 to MP3 on the fly? Rabbit hole is deep enough as it is...

    AAC does seem to work for Internet radio. And AAC is the audio compression format used in many MP4 files... Could this be an option?
  • Let’s talk a bit about containers.
  • Explain container format. Analogy: a ZIP file (but it’s not a ZIP file).
  • Frustration. But, we’re now so deep in the rabbit hole, this SHOULD work, right? RIGHT????
    What do developers do when something does not work? Google!
    All sites I found had “visited” color, damned! But on page 25 of some search I did, I found something...
  • I found a hacker! Who was investigating the same stuff... Since then, he did elaborate on the entire process, but back then he was around the same stage as I was in this investigation.
    “hey, can i dm you? working on similar thing and we might be able to help each other.”
    We started chatting, and at some point he says to me:
    “It’s AAC, but in ADTS format. Each atom needs a header in every frame!”
  • WAT?
  • All we need now is an app!
  • Also, I know NOTHING about Android development
    Went with Android Studio, at least I know the general workings of the IDE, as it’s the same base IDE as IntelliJ and Rider.
  • Started with empty activity. It’s debatable, but I prefer clean templates that I can add incremental things to, as opposed to full-blown templates that I have no idea what they are doing.
    Could have picked Java, but chose Kotlin. It’s the de-facto Android language nowadays, and it’s very similar to C# - my “mother tongue” in programming languages, so to speak.
    Also fully compatible with Java, so can use any libraries out there and when needed, even mix languages in one project.
  • NOW WHAT?
    Decided to start with the most important thing...
  • Behold, the UI design! This is also the final design of the app.
  • Activity already present, added the intent filter to accept URL data with https and something that looks like YouTube.
  • Based on a null check, we can do other things. So with that out of the way, we can start building!
    Explain coroutine scope, it’s to enable async/await like features in Kotlin for this class.
  • Went to packages tab (package search plugin), and started adding random things that matched what I was after.
    Ktor I knew from my colleague, who is building it out, and is a webserver/client framework.
  • Turns out there are many packages that can help run FFMpeg, even on Android!
    They are wrappers, and “sort of” run command line.
    Which means we would have to download MP4 first, convert to ADTS, then stream to device.
    Workable for 2-3 minute songs, but for a 1h30 DJ set that is 250 MB, it’s not ideal. We’re also messing up temporary storage on the device and all.
    I want close to zero delay between playing and the stream starting!
  • Explain MP4 is a set of boxes, lots of pointers. But we will try to better visualize on the next slide.
  • Explain logic of injecting ADTS header for each frame
  • Start with AndroidManifest.xml, mention more than we saw on slides. For example, permissions.
    Trial and error, based on what I was doing Android would throw an exception telling me which permission I was missing.
    MainActivity - onCreate
    Extract YouTube URL
    If found, discover Sonos devices - discoverSonosDevices()
    Explain withContext(Dispatchers.IO) { - run this on IO thread
    Discover using multicast
    Discover using broadcast
    Look at code for those
    If devices found, prompt using AlertDialog builder
    When device selected, trySetupStreamingFor
    Get YouTube metadata
    Generate video URL on local IP address (wifi only)
    Extract album art and all
    device.playUri(....) using that SOAP request from earlier
    onCreate also started a webserver, which runs as a foreground service
    Needed to make sure our server keeps running even if we’re doing other stuff on our phone
    RunEmbeddedWebServerService
    get("/{videoId}.mp4")
    Extract metadata again, get MP4 audio URL
    Run MP4AacToAdtsAacConverter on the MP4 URL stream, using AacAdtsWriter
    MP4AacToAdtsAacConverter is something I had to cook up. Lots of trial and error, and deadF00d helped with insights
    Parse boxes, read header, then skip to samples and push those out with ADTS header each time
    AacAdtsWriter - go through the byteshifting...
  • I recorded a demo back in January, but I think I look a bit tired there. So let’s do that again.
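
The `device.playUri(....)` step from the notes comes down to the SOAP request captured with tcpdump. As a rough sketch in Python (not the app's code), the request could be assembled like this. The helper name `build_set_av_transport_uri` is hypothetical, the `encodingStyle` value is the standard SOAP 1.1 one and is assumed here because the slide truncates it, and the DIDL-Lite metadata is assumed to travel as escaped text inside `CurrentURIMetaData`.

```python
from xml.sax.saxutils import escape

AVTRANSPORT = "urn:schemas-upnp-org:service:AVTransport:1"


def build_set_av_transport_uri(uri, metadata=""):
    """Return (headers, body) for a UPnP SetAVTransportURI SOAP call."""
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        # Assumed value: the slide cuts this attribute off.
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        '<s:Body>'
        f'<u:SetAVTransportURI xmlns:u="{AVTRANSPORT}">'
        '<InstanceID>0</InstanceID>'
        f'<CurrentURI>{escape(uri)}</CurrentURI>'
        # Metadata (DIDL-Lite) is embedded as escaped XML text (assumption).
        f'<CurrentURIMetaData>{escape(metadata)}</CurrentURIMetaData>'
        '</u:SetAVTransportURI>'
        '</s:Body>'
        '</s:Envelope>'
    )
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        # UPnP names the action as "<service type>#<action>".
        "SOAPACTION": f'"{AVTRANSPORT}#SetAVTransportURI"',
    }
    return headers, body


headers, body = build_set_av_transport_uri("x-rincon-mp3radio://host/media.mp3")
```

The sketch only builds the request. Sending it would be an HTTP POST to the speaker on port 1400; the control path `/MediaRenderer/AVTransport/Control` is what libraries such as SoCo appear to use, and is an assumption here rather than something shown on the slides.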
