Do You Need A Residential Data Hub?

Data is essential for effective decision-making - even at home.
Residential Data Hubs: A Necessary Element @ Home

With more and more devices running in every home, it is becoming increasingly important to collect and manage all of the data that is available. Most people have no idea just how much data is currently being collected in their homes. But as the future arrives, almost every home will need to aggregate and assess data in order to make informed decisions and take informed actions. When that time arrives for you, you will need a “plug and play” residential data hub. Such devices will become an instrumental part of transforming your household into an efficient information processing system.

Currently, data is collected on your utility usage (e.g., electricity, water, Internet data usage, thermostat settings, etc.). But few people realize that future homes will be collecting enormous amounts of data. We (at the Olsen residence and at Lobo Strategies) have exploited many of the new technologies that are part of the Internet of Things (IoT). Through this experience, it is apparent just how much data is now available. We are collecting data about where our family and team members are located. We are collecting data on the physical environment throughout our buildings – including temperature and occupancy. We are collecting information on the internal and external network resources being used by “the team.” And the amount of data being collected today will be dwarfed by the amount of data that will be collected in the next few years.

The Necessity Of Residential Data Hubs

Over the past six months, we have been assembling a huge portfolio of data sources.

  • We use our DNS server logs and firewall logs to collect access-related data.
  • The Home Assistant platform collects data about all of our IoT devices.  [Note: In the past month, we’ve begun consolidating all of our IoT data into a TICK platform.]
  • Starting this week, we are using router data to optimize bandwidth consumption.

While it is possible to manage each of these sources, it is taking quite a bit of “integration” (measured in many labor hours) to assemble and analyze this data. But we are now taking steps to assemble all of this data for easy analysis and decision-making.

Consolidating Router Data

Our ISP put us in a box: they offered us an Internet “data only” package at a seriously reduced price. But buried within the contract were express limits on bandwidth.  [Note: Our recent experience has taught us that our current ISP is not a partner; they are simply a service provider. Indeed, we have learned that we will treat them as such in the future.] Due to their onerous actions, we are now on a needed content diet. And as of the beginning of the week, we have taken the needed steps to stay within the “hidden” limits that our ISP imposed.

Fortunately, our network architect (i.e., our beloved CTO) found the root cause of our excessive usage. He noted the recent changes approved by the premise CAB (i.e., our CTO’s beloved wife). And then he correlated this with the DNS log data that identified a likely source of our excess usage. This solved the immediate problem. But what about the irreversible corrective action?
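
For anyone who wants to try the same kind of correlation, here is a minimal sketch. It assumes dnsmasq-style query log lines; the sample lines, addresses, and function name are made up for illustration and are not taken from our actual logs.

```python
import re
from collections import Counter

# Assumed dnsmasq-style query line, for example:
#   Oct  1 12:00:01 dnsmasq[412]: query[A] example.com from 192.168.1.20
QUERY_RE = re.compile(
    r"query\[(?P<qtype>\w+)\]\s+(?P<name>\S+)\s+from\s+(?P<client>\S+)"
)

def tally_queries(lines):
    """Count DNS queries per (client, domain) pair, ignoring non-query lines."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            counts[(match["client"], match["name"])] += 1
    return counts

# Hypothetical log excerpt.
sample_log = [
    "Oct  1 12:00:01 dnsmasq[412]: query[A] video.example.com from 192.168.1.20",
    "Oct  1 12:00:02 dnsmasq[412]: query[A] video.example.com from 192.168.1.20",
    "Oct  1 12:00:03 dnsmasq[412]: query[AAAA] mail.example.org from 192.168.1.31",
    "Oct  1 12:00:03 dnsmasq[412]: forwarded video.example.com to 9.9.9.9",
]

top = tally_queries(sample_log).most_common(1)
print(top)  # [(('192.168.1.20', 'video.example.com'), 2)]
```

Query counts only hint at who is talking to what; they say nothing about bytes transferred. That is why they are best read alongside the change log, as described above.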

And as of yesterday, we’ve also taken the steps needed for ongoing traffic analysis.

  1. We’ve exploited our premise network decisions. We normally use residential-grade equipment in our remote locations. In candor, the hardware is comparable to its pricier, enterprise brethren. But the software has always suffered. Fortunately, we’ve used DD-WRT in any premise location. By doing this, we had a platform that we could build upon.
  2. The network team deployed remote access tools (i.e., ssh and samba) to all of our premise routers.
  3. A solid-state disk drive was formatted and then added to the router’s USB 3.0 port. [Note: We decided to use a non-journaled filesystem to limit excessive read/writes of the journal itself.]
  4. Once the hardware was installed, we deployed YAMon on the premise router.
  5. After configuring the router and YAMon software, we began long-term data collection.
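
Once per-device usage is being recorded, staying under a cap is mostly arithmetic. The sketch below is one way to do it, assuming you have already exported month-to-date totals per device. The device names, the numbers, and the 1,000 GB cap are hypothetical, and this does not read YAMon's own files.

```python
import calendar
from datetime import date

def project_monthly_usage(used_gb, today):
    """Linearly extrapolate month-to-date usage to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return used_gb * days_in_month / today.day

def devices_over_share(usage_by_device, share=0.25):
    """Return the devices responsible for more than `share` of total usage."""
    total = sum(usage_by_device.values())
    if total == 0:
        return []
    return sorted(
        name for name, gb in usage_by_device.items() if gb / total > share
    )

# Hypothetical month-to-date totals in GB.
usage = {"living-room-tv": 180.0, "office-pc": 60.0, "tablet": 40.0, "sensors": 20.0}
CAP_GB = 1000.0  # hypothetical ISP limit

projected = project_monthly_usage(sum(usage.values()), date(2018, 10, 10))
print(projected, projected > CAP_GB)  # 930.0 False
print(devices_over_share(usage))      # ['living-room-tv']
```

A straight-line projection is crude (weekends and holidays skew it), but it is enough to raise a flag before the bill arrives.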

Next Steps

While the new network data collection is very necessary, it is not a solution to the larger problem. Specifically, it is adding yet another data source (i.e., YADS). So what is now needed is a real nexus for all of the disparate data sources. We truly need a residential data hub. I need to stitch together the DNS data, the router data, and the IoT data into a single, consolidated system with robust out-of-the-box analysis tools.  
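
To make the idea concrete, here is a rough sketch of the "stitching" step: every source is normalized into one common event shape and then merged into a single timeline. The Event fields, the source names, and the sample values are assumptions for illustration; a real hub would need far more than this.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True, order=True)
class Event:
    """One normalized record; ordering is by timestamp first."""
    timestamp: datetime
    source: str   # e.g. "dns", "router", "iot"
    subject: str  # device address or entity name
    value: str

def merge_sources(*sources):
    """Merge streams that are each already time-sorted into one timeline."""
    return list(heapq.merge(*sources))

def at(minute):
    # Helper for building hypothetical timestamps.
    return datetime(2018, 10, 1, 12, minute, tzinfo=timezone.utc)

dns = [Event(at(0), "dns", "192.168.1.20", "video.example.com")]
router = [Event(at(5), "router", "192.168.1.20", "1.2 GB")]
iot = [Event(at(2), "iot", "sensor.hallway_temp", "21.5")]

timeline = merge_sources(dns, router, iot)
for event in timeline:
    print(event.timestamp.isoformat(), event.source, event.subject, event.value)
```

The hard part is not the merge. It is agreeing on the common shape and writing (and maintaining) an adapter for every source.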

I wonder if it is time to build just such a tool – as well as launch the services that go along with the product.

Home Assistant Portal: TNG

Over the past few months, I have spent much of my spare time deepening my home automation proficiency.  And most of that time has been spent understanding and tailoring Home Assistant. But as of this week, I am finally at a point where I am excited to share the launch of my Home Assistant portal. 

Overview

Some of you may not be familiar with Home Assistant (HA). So let me spend one paragraph outlining the product. HA is an open source “home” automation hub. As such, it can turn your lights on and off, manage your thermostat, open/close your garage door (and window blinds). And it can manage your presence within (and around) your home. And it works with thousands of in-home devices. It provides an extensive automation engine so that you can script countless events that occur throughout your home.  It securely integrates with key cloud services (like Amazon Alexa and Google Assistant). Finally, it is highly extensible – with a huge assortment of add-ons available to manage practically anything.

Meeting Project Goals

Today, I finished my conversion to the new user interface (UI). While there have been many ways to access the content within HA before now, the latest UI (code-named Lovelace) makes it possible to create a highly customized user experience. And coupled with the theme engine baked into the original UI (i.e., the ‘frontend’), it is possible to make a beautiful portal to meet your home automation needs.

In addition to controlling all of the IoT (i.e., Internet of Things) devices in our home, I have baked all sorts of goodies into the portal. In particular, I have implemented (and tailored) the data collection capabilities of the entire household. At this time, I am collecting key metrics from all of my systems as well as key state changes for every IoT device. In short, I now have a pretty satisfying operations dashboard for all of my home technology.
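
If you want to pull the same kind of state data out of HA yourself, its REST API exposes every entity's current state at /api/states (authenticated with a long-lived access token). The sketch below is one possible approach, not necessarily how my portal does it; the base URL, the token, and the sample payload are placeholders.

```python
import json
import urllib.request

def fetch_states(base_url, token):
    """Fetch all entity states from Home Assistant's REST API (GET /api/states)."""
    request = urllib.request.Request(
        f"{base_url}/api/states",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)

def summarize_by_domain(states):
    """Group entity states by domain (the part of entity_id before the dot)."""
    summary = {}
    for entity in states:
        domain = entity["entity_id"].split(".", 1)[0]
        summary.setdefault(domain, {})[entity["entity_id"]] = entity["state"]
    return summary

# Hypothetical payload in the shape that /api/states returns.
sample_states = [
    {"entity_id": "light.kitchen", "state": "on", "attributes": {}},
    {"entity_id": "light.porch", "state": "off", "attributes": {}},
    {"entity_id": "sensor.hallway_temp", "state": "21.5", "attributes": {}},
]

summary = summarize_by_domain(sample_states)
print(sorted(summary))  # ['light', 'sensor']
```

With a real instance, you would call something like summarize_by_domain(fetch_states("http://homeassistant.local:8123", token)), where both the address and the token are specific to your installation.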

Bottom Line

Will my tinkering end with this iteration? If you know me, then you already know the answer. Continuous process improvement is a necessary element for the success of any project. So I expect rapid changes will be made almost all of the time – and starting almost immediately. And as a believer in ‘agile computing’ (and DevOps product practices), I intend to include my ‘customer(s)’ in every change. But with this release, I really do feel like my HA system can (finally) be labeled as v1.0! 

VPNFilter Scope: Talos Tells A Tangled Tale

IoT threats
Hackers want to take over your home.

Several months ago, the team at Talos (a research group within Cisco) announced the existence of VPNFilter – now dubbed the “Swiss Army knife” of malware. At that time, VPNFilter was impressive in its design. And it had already infected hundreds of thousands of home routers. Since the announcement, Talos continued to study the malware. Last week, Talos released its “final” report on VPNFilter. In that report, Talos highlighted that the VPNFilter scope was/is far larger than first reported.

“Improved” VPNFilter Capabilities

In addition to the first stage of the malware, the threat actors included the following “plugins”:

  • ‘htpx’ – a module that redirects and inspects the contents of unencrypted Web traffic passing through compromised devices.
  • ‘ndbr’ – a multifunctional secure shell (SSH) utility that allows remote access to the device. It can act as an SSH client or server and transfer files using the SCP protocol. A “dropbear” command turns the device into an SSH server. The module can also run the nmap network port scanning utility.
  • ‘nm’ – a network mapping module used to perform reconnaissance from the compromised devices. It performs a port scan and then uses the Mikrotik Network Discovery Protocol to search for other Mikrotik devices that could be compromised.
  • ‘netfilter’ – a firewall management utility that can be used to block sets of network addresses.
  • ‘portforwarding’ – a module that allows network traffic from the device to be redirected to a network specified by the attacker.
  • ‘socks5proxy’ – a module that turns the compromised device into a SOCKS5 virtual private network proxy server, allowing the attacker to use it as a front for network activity. It uses no authentication and is hardcoded to listen on TCP port 5380. There were several bugs in the implementation of this module.
  • ‘tcpvpn’ – a module that allows the attacker to create a Reverse-TCP VPN on compromised devices, connecting them back to the attacker over a virtual private network for export of data and remote command and control.
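
Since the socks5proxy module is described as listening on TCP port 5380, one quick (and very incomplete) sanity check is to see whether anything on your own router answers on that port. The sketch below is only a generic TCP probe. The router address is a placeholder, an open port is not proof of infection, and a closed port is not proof of safety.

```python
import socket

def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False

if __name__ == "__main__":
    ROUTER = "192.168.1.1"  # placeholder: use your own router's address
    if tcp_port_open(ROUTER, 5380):
        print("Something is listening on 5380; time to investigate.")
    else:
        print("Nothing answered on 5380.")
```

Only probe equipment that you own or administer.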

Disaster Averted?

Fortunately, the impact of VPNFilter was blunted by the Federal Bureau of Investigation (FBI). The FBI recommended that every home user reboot their router. The FBI hoped that this would slow down infection and exploitation. It did. But it did not eliminate the threat.

In order to be reasonably safe, you must also ensure that you are on a version of router firmware that protects against VPNFilter. While many people heeded this advice, many did not. Consequently, there are thousands of routers that remain compromised. And threat actors are now using these springboards to compromise all sorts of devices within the home. This includes hubs, switches, servers, video players, lights, sensors, cameras, etc.

Long-Term Implications

Given the ubiquity of devices within the home, the need for ubiquitous (and standardized) software update mechanisms is escalating. You should absolutely protect your router as the first line of defense. But you also need to routinely update every type of device in your home.

Bottom Line

  1. Update your router! And update it whenever there are new security patches. Period.
  2. Only buy devices that have automatic updating capabilities. The only exception to this rule should be if/when you are an accomplished technician and you have established a plan for performing the updates manually.
  3. Schedule periodic audits of device firmware. Years ago, I did annual battery maintenance on smoke detectors. Today, I check every device at least once a month. 
  4. Retain software backups so that you can “roll back” updates if they fail. Again, this is a good reason to spend additional money on devices that support backup/restore capabilities. The very last thing you want is a black box that you cannot control.
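
The monthly audit in step 3 does not need fancy tooling. Here is a small sketch that compares an inventory of installed firmware against the latest versions you know about. The device names and version numbers are invented, and it assumes purely numeric dotted versions (real vendors are rarely that tidy).

```python
def parse_version(text):
    """Turn '1.9.10' into (1, 9, 10) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def outdated_devices(installed, latest):
    """Return {device: (installed, latest)} for devices behind the latest version."""
    report = {}
    for device, current in installed.items():
        newest = latest.get(device)
        if newest is not None and parse_version(current) < parse_version(newest):
            report[device] = (current, newest)
    return report

# Hypothetical inventory.
installed = {"router": "3.0.1", "thermostat": "4.2.0", "camera": "1.9.10"}
latest = {"router": "3.0.4", "thermostat": "4.2.0", "camera": "1.10.0"}

print(outdated_devices(installed, latest))
# {'router': ('3.0.1', '3.0.4'), 'camera': ('1.9.10', '1.10.0')}
```

Comparing tuples rather than raw strings matters: as text, "1.9.10" sorts after "1.10.0", which would hide the camera update.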

As the VPNFilter scope and capabilities have expanded, the importance of remediation has also increased. Don’t wait. Don’t be the slowest antelope on the savanna.

Alexa Dominance: Who Can Compete?

Alexa Dominance
Amazon Echo devices now have a foothold in most American homes.

Voice control is the ‘holy grail’ of UI interaction. You need only look at old movies and television to see that voice is indeed king. [For example, the Robinson family used voice commands to control their robot. And Heywood Floyd used voice as his means of teaching and communicating with HAL.] Today, there are many voice assistants available on the market. These include: Amazon Alexa, Apple Siri, Google Assistant (aka Google Home), Microsoft Cortana, Nuance Nina, Samsung Bixby, and even the Voxagent Silvia.  But the real leaders are only now starting to emerge from this crowded market. And as of this moment, Alexa dominance in third-party voice integration is apparent.

Apple Creates The Market

Apple was the first out-of-the-gate with the Apple Siri assistant. Siri first arrived on the iPhone and later on the iPad. Since its introduction, it has become available across the entire Apple i-cosystem. If you are an Apple enthusiast, Siri is on your wrist (with the watch). Siri is on your computer. And Siri is on your HomePod speaker. It is even on your earbuds. And in the past six months, we have finally started to see some third-party integration with Siri.

Amazon Seizes The Market

Amazon used an entirely different approach to entrench its voice assistant. Rather than launch the service across all Amazon-branded products, Amazon chose to first launch a voice assistant inside a speaker. This was a clever strategy. With a fairly small investment, you could have an assistant in the room with you. Wherever you spent time, your assistant would probably be close enough for routine interactions.

This strategy did not rely upon your phone always being in your pocket.  Unlike Apple, the table stakes for getting a voice assistant were relatively trivial. And more importantly, your investment was not limited to one and only one ecosystem.  When the Echo Dot was released at a trivial price point (including heavy discounts), Alexa started showing up everywhere. 

From the very outset, an Amazon voice assistant investment required funds for a simple speaker (and not an expensive smartphone). You could put the speaker in a room with a Samsung TV. Or you could set it in your kitchen. So as you listened to music (while cooking), you could add items to your next shopping list.  And you could set the timers for all of your cooking.  In short, you had a hands-free method of augmenting routine tasks.   In fact, it was this integration between normal household chores coupled with the lower entry price that helped to spur consumer purchases of the Amazon Echo (and Echo Dot).

A second key feature of Amazon’s success was its open architecture. Alexa dominance was amplified as additional hardware vendors adopted the Alexa ecosystem. And the young Internet-of-Things (IoT) marketplace adopted Alexa as its first integration platform. Yes, many companies also provided Siri and Google Assistant integration. But Alexa was their first ‘target’ platform.

The reason for Alexa integration was (and is) simple: most vendors sell their products through Amazon. So vendors gained synergies with their main supplier. Unlike the Apple model, you didn’t have to go to a brick and mortar store (whether it be the Apple Store, the carriers’ stores, or even BestBuy/Target/Walmart).  Nor did a vendor need to use another company’s supply chain. Instead, they could bundle the whole experience through an established sales/supply channel.

Google Arrives Late To The Party

While Apple and Amazon sparred with one another, Google jumped into the market. They doubled down on ‘openness’ and interoperability.  And at this moment, the general consensus is that the Google offering is the most open. But to date, they have not gained traction because their entry price was much higher than Amazon’s. We find this to be tremendously interesting. Google got the low price part down when they offered a $20-$30 video streamer.

But with the broader household assistant, Google focused first upon the phone (choosing to fight with Apple) rather than a hands-free device that everyone could use throughout the house. And rather than follow the pricing model that they adopted with the Chromecast, Google chose to offer a more capable (and more expensive) speaker product. So while they used one part of the Amazon formula (i.e., interoperability), they avoided the price-sensitive part of the formula.

Furthermore, Google could not offer synergies with the supply chain. Consequently, Google still remains a third-place contender. For them to leap back into a more prominent position, they will either have to beat ‘all-comers’ on price or they will have to offer something really innovative that the other vendors haven’t yet delivered.

Alexa Dominance

Amazon dominance in third-party voice integration is apparent. Not only can you use Alexa on your Amazon ‘speakers’, you can use it on third-party speakers (like Sonos). You can launch actions on your phone and on your computer. And these days, you can use it with your thermostat, your light bulbs, your power sockets, your garage door, your blinds, and even your oven. In my case, I just finished integrating Alexa with Hue lights and with an ecobee thermostat.

Bottom Line

Market dominance is very fleeting. I remember when IBM was the dominant technology provider. After IBM, Microsoft dominated the computer market. At that time, companies like IBM, HP, and Sun dominated the server market. And dominance in the software market is just as fleeting. Without continually focusing on new and emerging trends, leadership can devolve back into a competitive melee, followed by the obsolescence of the leader. Indeed, this has been the rule as dominant players have struggled to maintain existing revenue streams while trying to remain innovative.

Apple is approaching the same point of transition. Their dominance of the phone market is slowly coming to an end. Unless they can pivot to something truly innovative, they may suffer the same fate as IBM, Sun, HP, Dell, Microsoft, and a host of others.

Google may be facing the same fate – though this is far less certain. Since Google’s main source of revenue is ‘search-related’ advertising, they may see some sniping around the edges (e.g., Bing, DuckDuckGo, etc.). But there is no serious challenge to their core business – at this time.

And Amazon is in a similar position: their core revenue is the supply chain ‘tax’ that they impose upon retail sales. So they may not see the same impact on their voice-related offerings. But they dare not rest upon their laurels. In candor, the Amazon position is far more appealing than the Google position. The Amazon model relies upon other companies building products that Amazon can sell. So interoperability will always be a part of any product that Amazon brands – including voice assistants. 

Only time will sort out the winners and losers. And I daresay that there is room enough for multiple ‘winners’ in this space. But for me, I am now making all of my personal and business investments based upon the continued dominance of Alexa.

Gibbs’ “Home Automation” Boat

Pardon the coy title. But I love “NCIS”. And some days, I feel a lot like Gibbs: I have a pet project in the basement. And whenever I get bored or frustrated, I work on my project. But as of now, I think that my home automation project is finally sea-worthy.

My little Pi is a beast. I am running a long list of software on this device. I’m running Home Assistant with the following add-ons: DuckDNS, Let’s Encrypt, Mosquitto, Node-RED, Samba, and SSH. With this combination, I can monitor assets within the home. I can determine whether my wife and I are in the home or outside of the home. I can build automated tasks based upon data collected within the home. I can manage the assets and the compute infrastructure in the home. And I can secure it against exploitation by ‘bad actors’.

And now, I’ve finally gotten around to configuring the data collection and graphing infrastructure. The package of tools that I am using for this includes Grafana and InfluxDB. After installing the components, I set about the work of configuring the software.

InfluxDB is a time-series data repository. It is designed much like a NoSQL tool; data is written in series but isn’t updated after it is originally written. Later, the data is read serially and used in graphing and/or statistical studies. Fortunately, I was able to configure InfluxDB with very little incident. I think that years of econometric studies made this part relatively simple to implement.

Once done with the database, I turned my attention to Grafana itself. And it was very difficult to grok this tool. First, I ran into quite a bit of difficulty installing needed plugins. After poring over the logs (and consulting written guides on the Internet), I found that the “behind the curtain” instance running in a container was having difficulty downloading the ‘plugin’ components on the fly.

While scratching my head, I saw a quick popup about my dynamic address being updated. That’s when the light came on. For whatever reason, I had been having trouble with a sporadic inability to log into my system. The symptoms were that the login would just wait, and wait, and wait. I finally remembered that some applications really dislike running inside a VPN tunnel. And worse still, I wondered whether the IP address recorded in DNS was lagging behind my changing dynamic address.

So I disconnected from my VPN. That’s when things just started to work. It was quite odd, though. I could finally add the plugins. But I had changed the network on my Windows system – and not the network on the Pi. So there has to be something flowing through the browser. I’ll have to dig into that. But the problem had been solved.

I also found that I needed to update the DNS on my little server. Simply put, I had been using the Pi-hole (an ad blocking DNS server) to fulfill the DNS requests for the Home Assistant Pi system. I suspected that certain key DNS requests returned with null results. Therefore, I needed to clean up the DNS config on the Home Assistant Pi.
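
A quick way to test that suspicion is to ask the resolver for a hostname and see whether it comes back empty or as an "unspecified" address (0.0.0.0 or ::), which is how a blocking DNS server commonly answers for a blocked name. This is a minimal sketch under that assumption; the hostname at the bottom is only an example.

```python
import ipaddress
import socket

def resolve(hostname):
    """Return the set of addresses that the system resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return set()
    return {info[4][0] for info in infos}

def looks_blocked(addresses):
    """True if there are no answers, or every answer is 0.0.0.0 or ::."""
    return all(
        ipaddress.ip_address(addr.split("%")[0]).is_unspecified
        for addr in addresses
    )

if __name__ == "__main__":
    name = "grafana.com"  # example hostname; substitute whatever your tool needs
    answers = resolve(name)
    print(name, sorted(answers), "blocked?" if looks_blocked(answers) else "ok")
```

Run it once against the ad-blocking resolver and once against a normal upstream resolver; a difference between the two points at the block list.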

Once both of these tasks (i.e., the VPN and the Pi-hole DNS) were resolved, the plugins started to install. So my Grafana installation could proceed.

And then I hit the learning curve of Grafana itself.

Grafana is a very cool tool. But its user interface is not very intuitive. It took a few hours to figure out just how to add variables and select the right graphing interval before real data started to emerge. But once I learned these little tricks, the graphs became easy. Last night, I began the quest to graph all of the data that I could graph. This compulsion is a learned experience; I spent many years being driven by capacity and performance data. And I wasn’t harnessing the data that is coming from my home sensors. So I am now inspired to build all sorts of data models and graphs. [Note: I really love it when my past experiences can inform my current and future activities.]
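
The "graphing interval" trick is easier to understand once you see what it does to the data: raw readings are grouped into fixed-width time buckets and each bucket is reduced to one value. Here is a small sketch of that idea in plain Python, similar in spirit to an InfluxQL query that uses GROUP BY time(5m) with mean(). The readings are invented.

```python
from collections import defaultdict
from statistics import mean

def downsample(points, interval_s):
    """Average (epoch_seconds, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % interval_s].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

# Hypothetical temperature readings: (seconds since start, degrees C).
readings = [(0, 20.0), (60, 21.0), (120, 22.0), (300, 23.0), (360, 25.0)]

print(downsample(readings, 300))  # [(0, 21.0), (300, 24.0)]
```

Pick an interval that is too small and the graph is all noise; pick one that is too large and the interesting spikes get averaged away.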

All in all, I’ve spent a few hundred hours over a few months on this home automation project. And I have learned so much about home automation, container technologies, and web security at the edge of the network. Now I’m left with one nagging thought – and an irresistible question: How does anyone expect the average homeowner to know these things? Moreover, how can we expect consumers to care enough to learn these things? Most people want the “iPhone experience” where they can spend a lot and have someone else do the integration for them.

So which are you? Are you a maker/builder/integrator? Or are you a buyer?