Disney+ Review

As the parent of a six-year-old and a lover of both Star Wars and the Avengers, signing up for Disney+ was a no-brainer. I jumped on an early deal and prepaid for three years of the service, so we're on board whether it's good or not. The only hiccup I've had so far came on launch day when the app was overloaded, but since then the reliability has been great.

As I sat on the couch last Friday watching Jungle Book with the family, I kept wondering what my 10-year-old self would have thought if he could have seen me watching that movie on a 10-foot screen in my own house, with no tapes or discs in a player, all controlled from my phone.

I’ve seen some comments about various movies that aren’t on the service, but it’s a treat to scroll through the list of what IS on the service. You know how you scroll through Netflix or Hulu and you’ve never heard of most of it? Not so with Disney+. It’s hit after hit after hit. No more frustrating “Disney vault”. It’s all there at your fingertips.

Over the past few years, I've watched the Disney brand rise in my estimation. They've become synonymous with high-quality, but sometimes pricey, products. Thankfully Disney+ inherits only the first part of that. The cost is $6.99/month, with cheaper options if you pay ahead. That's crazy low when you compare it to other services.

Disney+ gets two thumbs up from me!

Strata 2019 San Francisco

My company was nice enough to send me down to San Francisco last week to attend the Strata Data Conference. If there’s a bigger conference in my field of data engineering/science/analysis, I don’t know what it is.

I attended a big data conference four years ago, but going to Strata was a huge step up, both in the quality of the event planning and in the quality of the talks. I came away with a stronger vision of what I want our team at work to accomplish and how we can have a bigger impact on our business group.

I skipped all the social events surrounding the conference, but I filled both days with every talk I could cram into my schedule. A couple were total duds, but there were a lot of great ones from Netflix, Lyft, Uber, Intuit, and others.

Aside from the conference itself, it was strange to be traveling alone. I did spend one evening in a movie theater watching Captain Marvel, but otherwise I mostly hung out in my room. I felt guilty about temporarily forcing Tyla into single parent mode and leaving my team at work short-handed, so I spent a lot of my free time working on the laptop and trying to make good use of my time.

My hotel was right next to Moscone West where the conference was held, and that was fantastic. I was able to get from my room to a talk in about 5 minutes. That let me hustle back to the room even during our ~45-minute breaks to get away from the crowds and recharge a bit. It's surprising how tiring it is to sit on your rear end and listen to talks all day. I felt like my brain was very full!

It was a great trip, and while it’s not something that I need to do every year, I hope I can go back in 3-4 years. Thank you Tyla for holding down the fort while I took this trip!

Patent Application

Azure Data Explorer has made a dramatic impact on my career. It has inspired a whole new breed of data engineering and it feels like a wide open playground for ideas and innovation. There were so many new ideas and patterns floating around in my head that I decided to attempt the patent process (through work) for one of them. I’ve never been through it before and it was interesting to see all the different levels of scrutiny and checks that go into it before you even sit down with a lawyer to start drafting the application.

I’m thrilled to announce that I’ve completed all of that work and my patent application has been submitted! Unfortunately… I’ve been advised not to share the details of it yet. After about 18 months, the US Patent Office will publish the application. At that point it will be public information on their site but it will still take another 2-3 years from that point for them to review it and either approve it or ask for some more information.

So I guess the point of this post is to say that I'm really excited about applying for my first patent. Even if it doesn't get approved, it's neat to see how the process works, and it has me wondering whether other ideas are patentable too.

Analyzing Water Data in Azure Data Explorer

One of my favorite systems at work officially launched a couple weeks ago as Azure Data Explorer (internally called Kusto). I’ve been doing some blogging for their team on their Tech Community site. You can see all my posts on my profile page. This post will use Azure Data Explorer too but I thought it fit better on this blog.

A year or two ago, our local water company replaced all of the meters with digital, cellular meters. I immediately asked if that meant we’d get access to more data and they said it was coming in the future. The future is now! If you happen to live in Woodinville, you can get connected with these instructions.

The site is nice and lets you see charts, but by now you probably know that I love collecting data about random things, so I immediately tried to figure out how to download the raw data. The only download directly supported from their site is the bi-monthly usage from the bills, but from the charts I could see that hourly data was available somewhere. A little spelunking in the Chrome dev tools revealed the right REST endpoint to call to get a big JSON array full of the water usage for every hour in the last ~11 months.
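The general approach can be sketched in a few lines of Python. To be clear, this is a hypothetical sketch: the endpoint URL and the JSON field names below are made up, since the real ones come from whatever you see in the dev tools network tab. It just downloads the JSON array and flattens it into a CSV that's easy to ingest into Azure Data Explorer.

```python
import csv
import json
from urllib.request import urlopen

# Hypothetical endpoint -- replace with the URL found in Chrome dev tools.
ENDPOINT = "https://example-water-utility.com/api/usage/hourly"

def fetch_hourly_usage(endpoint: str) -> list:
    """Download the JSON array of hourly readings from the utility's endpoint."""
    with urlopen(endpoint) as resp:
        return json.load(resp)

def write_csv(rows: list, path: str) -> None:
    """Flatten the readings into a CSV ready for Azure Data Explorer ingestion."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["Timestamp", "Gallons"])
        writer.writeheader()
        for row in rows:
            # Field names here are guesses; match them to the real payload.
            writer.writerow({"Timestamp": row["timestamp"], "Gallons": row["gallons"]})
```

From there, a one-time ingestion into a table gives you something to query.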

I pulled that into Azure Data Explorer and started querying to see what I could learn. This first chart shows the median water usage by three hour chunks of the day. Tyla and I usually both shower in the morning so it makes sense that 6-9am has the heaviest usage.

// Source table name is a placeholder for wherever the hourly data was ingested
WaterUsage
| summarize sum(Gallons)
    by Hour=bin(hourofday(Timestamp), 3), bin(Timestamp, 1d)
| summarize percentile(sum_Gallons, 50) by Hour
| render columnchart with (title = 'Median Water Usage by 3 Hour Bin', legend = hidden)

I feel like there’s probably a better way to write the next query, but this works. It’s the cumulative usage throughout each month. The four lines at the top of the chart are the summer months when I’m using the irrigation in the yard. The lines that drop off at the end of the chart happen because I ran the x axis all the way from 1 to 31 for every month, so shorter months run out of data, but it still conveys the general idea. It’s interesting how similar all the non-watering months are.

// Source table name is a placeholder for wherever the hourly data was ingested
WaterUsage
| summarize Gallons=sum(Gallons) by bin(Timestamp, 1d)
| extend Month=monthofyear(Timestamp), Day=dayofmonth(Timestamp)
// Original data had some missing rows, so patch them in as zero-usage days
| union (
    datatable(Timestamp:datetime, Gallons:long, Month:long, Day:long) [
        datetime(2018-11-26T00:00:00.0000000Z), 0, 11, 26,
        datetime(2018-11-27T00:00:00.0000000Z), 0, 11, 27
    ])
| order by Timestamp asc
| serialize MonthlyWater=row_cumsum(Gallons, Month != prev(Month))
| project Month, Day, MonthlyWater
| make-series sum(MonthlyWater) on Day from 1 to 32 step 1 by Month
| render linechart with (ycolumns = sum_MonthlyWater, xcolumn = Day, series = Month, legend = hidden, title = 'Cumulative Gallons By Month')

The data is in 10-gallon increments, so it’s not super precise, but it’s a LOT better than the two-month resolution I had previously. I’m excited to play around with this data and see if we can start decreasing our usage.

Along these same lines, I heard that the local power company is starting to install power meters with Zigbee connectivity so there’s a chance that I’ll be able to start getting more insight into my power consumption in a similar fashion…

Security Questions

Many sites still use “security” questions to help you retrieve your account. When you first create an account, they ask you things like “What was the name of your first pet?” and “What color was your first car?” Even if you’re doing well and using a long, random, unique password for that site, you probably just destroyed your security by answering those questions. I’m pretty sure I could answer most of those questions for some of my friends. This is a common route for hackers too, especially with all the information available on social media sites.

Pro-tip: you can lie. It’s ok. I already use LastPass to create and store random, unique, strong passwords for every single account, so I just generate more random characters for these security questions too. LastPass has a notes field for every account you store, so I drop the questions and answers right into that field in case I ever need them to recover my account.

Yes, I changed these after taking the screenshot.