
Wednesday, 11 July 2012 15:17

Big Data Collection and Security

Written by

Rarely a day goes by that I don’t speak with a customer, prospect, industry analyst or reporter about big data. The companies I talk to are typically at the early stages, or “collection” phase. They are implementing the infrastructure and starting to gather streaming information from all kinds of sources. In fact, they’re collecting data at a rapid pace while still unsure how, or even what, information will be used. Most of these organizations have a vision of what insights will be derived from big data, but at the same time, they also expect their hypotheses to change as the project moves forward.

As strange as it may sound, I think this is actually a good thing. It’s great for the market adoption of big data technologies and especially for up-and-coming companies like 10gen, DataStax, Cloudera and the like. I think this will be the norm for a while: big data projects will start with a rough idea of the downstream insights and benefits, though not one as well defined as it might have been in the days of RDBMS database projects.

My advice is to do what I am seeing others do: start collecting. Start streaming available data into your systems. Yes, have a working hypothesis, but don’t wait until it is fully baked. I think you, like others, can adapt as the retained data grows.

And, if you too are in this “collection” phase, please think about securing and protecting the data you are gathering and storing. Make sure that there are safeguards in place so the data can’t leak out.

A best practice is to simply encrypt the data and store the key securely, but even if you don’t, have a discussion with your team on the sensitivity of the data you are collecting. Discuss the potential of negative impact if it were to get outside of the organization or even outside of the team that has access to it.
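To make the "encrypt the data and store the key securely" advice a little more concrete, here is a minimal sketch using Ruby's standard openssl library. The helper names encrypt_record and decrypt_record and the sample record are made up for illustration, and AES-256-GCM is simply one reasonable choice of cipher. The important part is that the key is generated and held separately from the stored record.

```ruby
require 'openssl'

# Hypothetical helper: encrypt one record with AES-256-GCM.
# The key is passed in so it can live somewhere other than the data store.
def encrypt_record(plaintext, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv   # fresh nonce for every record
  cipher.auth_data = ''   # no additional authenticated data in this sketch
  ciphertext = cipher.update(plaintext) + cipher.final
  { iv: iv, tag: cipher.auth_tag, data: ciphertext }
end

# Hypothetical helper: decrypt a record produced by encrypt_record.
# Raises OpenSSL::Cipher::CipherError if the key is wrong or the data was altered.
def decrypt_record(record, key)
  decipher = OpenSSL::Cipher.new('aes-256-gcm')
  decipher.decrypt
  decipher.key = key
  decipher.iv = record[:iv]
  decipher.auth_tag = record[:tag]
  decipher.auth_data = ''
  decipher.update(record[:data]) + decipher.final
end

# The 32-byte key would be kept in a separate key store, not next to the data.
key = OpenSSL::Random.random_bytes(32)
record = encrypt_record('sensor=42;reading=17.5', key)
puts decrypt_record(record, key)
```

Only the record hash (nonce, tag and ciphertext) needs to sit with the collected data. Anyone who gets hold of it without the key sees nothing useful.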

Over time, I believe this big data market will mature from the current collection phase to more structure, analysis, insights and active decision-making. At that time, I can see huge amounts of “not needed” data being purged, siphoned off or simply ignored. But, we’re not there yet. So, start an experiment or two with big data architectures and start collecting. Your competitors have probably already started.

Tuesday, 10 July 2012 13:45

zNcrypt Chef Cookbook Part III

Written by

In Part I of this series we went over some tips and basic steps for creating a new Chef cookbook. In Part II we explored in detail the zNcrypt cookbook and recipe that perform a basic installation of zNcrypt. In this edition we will use Chef data bags to activate the zNcrypt installation.


Data bags are a useful way to pass configuration information to recipes as JSON. For zNcrypt, we will use a data bag to pass license/passphrase information to the cookbook. There are two basic ways to set up a data bag: you can use the knife command, or you can set up the data bag programmatically.

knife data bag create BAG [ITEM] (options)

In our zNcrypt cookbook we will not use knife commands but rather set up the data bag programmatically. Let's review how we do this in the default.rb recipe. We start with a data_bag('license_pool') call to check whether the data bag exists. If this call fails, the "rescue" section sets up the new data bag.

# check if the data bag exists, use a begin / rescue to handle the exception
begin
 # check if there is a license pool already and skip creating
 data_bag('license_pool')
rescue

Here in the rescue section of the code, we will use the OpenSSL cookbook to generate a strong password, then set up a license and activation code for each of the servers in our environment. See the openssl cookbook for more information on how to use secure_password: https://github.com/opscode/cookbooks/tree/master/openssl

# check if the data bag exists, use a begin / rescue to handle the exception
begin
 # check if there is a license pool already and skip creating
 data_bag('license_pool')
rescue
 # the data bag does not exist yet, so create it here
 # include the secure password helper from the openssl cookbook
 ::Chef::Recipe.send(:include, Opscode::OpenSSL::Password)

 # create a data bag for licensing pool
 license_pool = Chef::DataBag.new
 license_pool.name('license_pool')
 license_pool.save
 # create json for data bag item for each node
 ubuntu = {
   # use the node name as the id
   "id" => "ubuntu",
   # set your product key provided by Gazzang
   # this license will auto reset every hour, if your first registration
   # fails try again in an hour or contact sales@gazzang.com
   "license" => "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
   # set your activation code provided by Gazzang
   "activation_code" => "123412341234",
   # random passphrase
   "passphrase" => secure_password,
   # second random passphrase
   "passphrase2" => secure_password,
 }
 # save the hash as an item in the license_pool data bag
 databag_item = Chef::DataBagItem.new
 databag_item.data_bag('license_pool')
 databag_item.raw_data = ubuntu
 databag_item.save
end
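The recipe only builds an item for a single node named "ubuntu". If you have several servers, one possible way to produce an item per node is sketched below in plain Ruby, outside of Chef. The helper name license_items, the second node name and the license values are made up for illustration, and SecureRandom stands in for the openssl cookbook's secure_password so that the snippet runs on its own.

```ruby
require 'securerandom'
require 'json'

# Hypothetical helper: build one data bag item hash per node.
# licenses maps a node name to its [license, activation_code] pair.
def license_items(licenses)
  licenses.map do |node_name, (license, activation_code)|
    {
      'id' => node_name,                       # node name doubles as the item id
      'license' => license,
      'activation_code' => activation_code,
      'passphrase' => SecureRandom.hex(16),    # stand-in for secure_password
      'passphrase2' => SecureRandom.hex(16)
    }
  end
end

# Placeholder values only; real licenses and codes come from Gazzang.
items = license_items(
  'ubuntu' => ['XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX', '123412341234'],
  'centos' => ['YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY', '432143214321']
)
puts JSON.pretty_generate(items.first)
```

Inside a recipe, each of these hashes could then be assigned to raw_data on a new Chef::DataBagItem and saved, exactly as the recipe does for the "ubuntu" item.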

Now that we have set up the data bag, let's see how the activate.rb recipe uses it to activate zNcrypt. We will use the Chef "node.name" attribute to select the license that matches the server. We can then construct the string to pass as arguments to the ezncrypt-activate command.

# check if there is a license pool otherwise skip activation
 data_bag('license_pool')
 license=data_bag_item('license_pool',"#{node.name}")['license']
 activation_code=data_bag_item('license_pool',"#{node.name}")['activation_code']
 # we also need a passphrase and a second passphrase; these are the random
 # ones generated when the data bag was created
 passphrase=data_bag_item('license_pool',"#{node.name}")['passphrase']
 passphrase2=data_bag_item('license_pool',"#{node.name}")['passphrase2']
 # build the arguments to the activate command
 activate_args="--activate --license=#{license} --activation-code=#{activation_code} --passphrase=#{passphrase} --passphrase2=#{passphrase2}"
 # run the activation as root in a bash script
 script "activate zNcrypt" do
  interpreter "bash"
  user "root"
  code <<-EOH
  mkdir /var/log/ezncrypt
  ezncrypt-activate #{activate_args}
  EOH
 end
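Because the argument string is interpolated into a bash script, a passphrase containing spaces or shell metacharacters could break the command. The following sketch, in plain Ruby outside of Chef, shows one way to build the same arguments with each value shell-escaped. The helper name build_activate_args and the sample values are hypothetical; the flags are the ones used in the recipe.

```ruby
require 'shellwords'

# Hypothetical helper: build the ezncrypt-activate argument string from a
# data bag item hash, shell-escaping every value.
def build_activate_args(item)
  [
    '--activate',
    "--license=#{Shellwords.escape(item['license'])}",
    "--activation-code=#{Shellwords.escape(item['activation_code'])}",
    "--passphrase=#{Shellwords.escape(item['passphrase'])}",
    "--passphrase2=#{Shellwords.escape(item['passphrase2'])}"
  ].join(' ')
end

# Sample item with deliberately awkward passphrases.
item = {
  'license' => 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX',
  'activation_code' => '123412341234',
  'passphrase' => 'correct horse battery',
  'passphrase2' => 'pa$$ word'
}
puts build_activate_args(item)
```

In the recipe, the result would take the place of the hand-built activate_args string.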

One problem with this example is that the data bag stores the encryption password in clear text. In future blogs we will see how we can use Chef encrypted data bags to protect the encryption password.

As you can see data bags are a very useful method to pass configuration to cookbooks. Another method to pass configuration information to cookbooks is using Chef Attributes. Please read the next blog to see how we will use Chef Attributes to setup the zNcrypt configuration directories.

Wednesday, 27 June 2012 19:00

Opaque Object Storage

Written by

Gazzang has built an interesting business around Transparent Data Encryption, building on top of eCryptfs, adding some mandatory access controls and policy management in a product we call zNcrypt. 

With the addition of zTrustee to the Gazzang product portfolio, we have entered an even more interesting, ambitious, new space in modern Cloud computing -- Opaque Object Storage.

eCryptfs and zNcrypt provide "transparent" data encryption, in that when the encrypted filesystem is mounted, allowed users, applications and processes (possessing the appropriate keys) can seamlessly read and write data as regular files and directories on the mounted filesystem.  No user, application, or process needs to know anything about encryption -- they simply read and write data "transparently" from and to files and directories.  Input/output operations are trapped in the Linux filesystem layer, and eCryptfs handles encrypting and decrypting files as necessary.  Assuming you have safeguarded your keys appropriately, an offline attacker with physical or remote access to the disk would not have access to the mounted filesystem and would instead see only the cryptographically protected data.

zTrustee was designed from the ground up to store and serve the keys necessary to make eCryptfs and zNcrypt work properly.  But we implemented it in a manner in which we can store and serve keys, certificates, files, directories, and data of any type -- similar to some object storage systems.  However, we added our considerable security expertise to our implementation, and use encryption yet again to our customers' advantage.  Each of the objects stored in zTrustee is actually encrypted and signed with the public GPG keys of the client storing and/or retrieving the data.  This means that even an administrative user with full root access on the zTrustee server will not have introspection into the contents of the data blobs stored as deposits on the zTrustee server.  For this reason, we're calling these deposits "Opaque Objects", and noting that our zTrustee server provides "Opaque Object Storage".
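To illustrate the idea of an opaque deposit (and only the idea: this is not zTrustee code), here is a small sketch using Ruby's standard openssl library. An RSA key pair stands in for the client's GPG key, a plain hash stands in for the server's deposit store, and the secret and the deposit name are made up.

```ruby
require 'openssl'

# Client side: a key pair that never leaves the client (stand-in for a GPG key).
client_key = OpenSSL::PKey::RSA.new(2048)
oaep = OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING

# The secret to deposit, for example a filesystem passphrase (made-up value).
secret = 'example-mount-passphrase'

# Encrypt to the client's own public key, then sign the resulting blob.
blob = client_key.public_key.public_encrypt(secret, oaep)
signature = client_key.sign(OpenSSL::Digest.new('SHA256'), blob)

# Server side: the store only ever holds the opaque blob and its signature.
server_store = { 'deposit-1' => { blob: blob, signature: signature } }

# Retrieval: the client checks the signature and decrypts with its private key.
deposit = server_store['deposit-1']
verified = client_key.public_key.verify(
  OpenSSL::Digest.new('SHA256'), deposit[:signature], deposit[:blob]
)
recovered = client_key.private_decrypt(deposit[:blob], oaep)

puts "signature ok: #{verified}"
puts "recovered matches: #{recovered == secret}"
```

An administrator who dumps server_store sees only the encrypted blob and the signature. Without the client's private key, the contents stay opaque.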

Moreover, the fine-grained security policies that govern the release of these deposits further differentiate zTrustee from various other object storage products.  Beyond the individual encryption of each zTrustee deposit (object), the policy by which an object is released can:

  • limit the time to live (TTL)
  • limit the number of times it can be retrieved (e.g. Mission Impossible message)
  • be disabled (and later enabled)
  • be purged entirely
  • require 0 to N trustees to authenticate and "vote" or "sign off" on the release
  • be retrieved by an authenticated, signed/encrypted client, or
  • be retrieved anonymously using a nonce URL
With this level of policy control, encryption, and cryptographic signature enforcement, we believe we've built something really quite interesting and useful for modern Cloud computing applications.
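As a rough illustration of how release policies like these might be evaluated, here is a small Ruby sketch. The ReleasePolicy class, its method names and its parameters are all hypothetical and are not zTrustee's API. They only model the TTL, retrieval-count, enable/disable and trustee-vote rules from the list above.

```ruby
# Hypothetical policy model; not zTrustee's API.
class ReleasePolicy
  def initialize(ttl_seconds: nil, max_retrievals: nil, required_votes: 0,
                 created_at: Time.now)
    @ttl_seconds = ttl_seconds
    @max_retrievals = max_retrievals
    @required_votes = required_votes
    @created_at = created_at
    @enabled = true
    @retrievals = 0
    @votes = []
  end

  def disable
    @enabled = false
  end

  def enable
    @enabled = true
  end

  # Record a trustee's sign-off; each trustee is counted once.
  def vote(trustee)
    @votes << trustee unless @votes.include?(trustee)
  end

  # Returns true and counts the retrieval only if every rule allows release.
  def try_release(now = Time.now)
    return false unless @enabled
    return false if @ttl_seconds && (now - @created_at) > @ttl_seconds
    return false if @max_retrievals && @retrievals >= @max_retrievals
    return false if @votes.size < @required_votes

    @retrievals += 1
    true
  end
end

# A "Mission Impossible" deposit: one retrieval, two trustee sign-offs.
policy = ReleasePolicy.new(max_retrievals: 1, required_votes: 2)
policy.vote('alice')
puts policy.try_release   # false, only one trustee has signed off
policy.vote('bob')
puts policy.try_release   # true, the first and only allowed retrieval
puts policy.try_release   # false, the retrieval limit is used up
```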

Stay tuned for some examples!
:-Dustin
Wednesday, 27 June 2012 09:36

Gazzang Top 5 - GigaOm Structure edition

Written by

There’s really nothing like a cross-country flight from San Francisco to Austin in late June to really shake up your senses. We departed under cool 55-degree temperatures only to arrive back home to a balmy 101. I imagine this is how a piece of firewood must feel as it’s pulled from a cool, dry resting place and tossed into a blazing furnace.

But that’s life on the road, I guess. Or at least it was last week, when Gazzang hit the GigaOm Structure Conference in San Francisco to launch zTrustee.

Below are a few thoughts from last week, along with a tremendous analyst write-up on our just-announced universal key management solution.

    • We heard and participated in a number of good discussions on cloud security at Structure. It’s great to see this topic continue to be dissected and debated. I think everyone can agree there’s no single solution to cloud security, especially when you factor in escalating compliance mandates, international data privacy regulations and the rapid adoption of big data as a service.

 

    • We believe the right approach is to adopt layers of security with encryption and key management at the core. Here’s a photo of me being interviewed on the subject →

 

    • It was great to connect with new GigaOm Pro analyst, Davi Ottenheimer. Kudos to GigaOm on bringing him into the fold. We spent a good 90 minutes with Davi on Wednesday, talking security for big data and demoing zTrustee. We look forward to seeing GigaOm take on a greater security focus in the future.

 

 

Over time, we saw the opportunity to take what we learned from our KSS and expand its capabilities to manage keys from all encryption utilities as well as tokens, certificates and other bits of important IT DNA. So begat zTrustee, and the solution is hitting the market at just the right time. As Steve Coplan from 451 Research notes:

“The outcome of hybridization is that there is more need to maintain trust across domains and insulate proprietary data running on third-party services from unauthorized access. This all adds up to greater likelihood of cryptographic keys and security materials floating around. In this initial iteration, Gazzang is providing an operational fix to a clear problem.”

  • Finally, a quick travel tip for folks tired of overspending on hotels. Book through HomeAway. We found an awesome four-bedroom townhouse in Pacific Heights, about two miles from the conference. Just check out the patio and zen garden. You can’t find this at the Staybridge Suites. At around 4:30pm on Tuesday, Dustin Kirkland and I took an 8.5-mile jog to the Golden Gate and back.

Had we run that distance in Austin, I’m fairly certain my feet would have melted.

There’s a great Wall Street Journal blog by Rachael King on the increase in cyber attacks targeting privileged accounts. A recent survey she cites suggests 64% of IT administrators “believe that the majority of recent security attacks involved the exploitation of so-called privileged accounts.” These are access points built into devices by the manufacturer that make it easier for IT to manage the network.

I’m a big proponent of multi-layer security that includes security controls not singularly tied to user identity or some proxy user identity method. This is particularly problematic at the OS level but certainly at the database and other levels as well.

In previous roles, I’ve worked with a number of developers and architects who believed OS-level file access controls granted to a user were good enough; that layered security wasn’t always necessary. I realize writing good crypto, key management, and other security code, or finding folks who can, is tough and expensive, but that’s not an excuse for failing to implement multiple levels of security.

There’s a reason your bank locks its doors, and then they lock the safe, and then they lock the safe deposit boxes. Security is about layers, and with virtualization, cloud, big data, and more, building out these layers becomes even more important.

Mike.Frank.Blog

Gazzang is focused on helping organizations secure big data in the cloud.  Our bread and butter is Unified Transparent Encryption through Gazzang zNcrypt. This solution encrypts sensitive data in Linux on the fly as it’s written to disk.  Access to that data is managed by our key storage system and process-based access control lists. We designed the system to limit OS and root privileges, preventing data from being stolen via “privilege escalation.”

Speaking of layered security, I want to wrap up this blog with some additional thoughts on Gazzang zTrustee, which we announced earlier this week. Our chief architect, Dustin Kirkland, touched on this some yesterday, but fundamentally zTrustee focuses on securing and providing access to those keys, certificates, tokens and other “opaque objects” that act as the gatekeepers to sensitive information about your IT DNA.

This data, if it were to fall into the wrong hands through privileged accounts or another form of attack, could be disastrous to your organization. It’s not unusual to have robust policies governing access to keys, certs, tokens and passphrases, but what about some of the more obscure files or file directories like those containing ACLs or connection strings?

An interesting method of protecting this data is with the concept of a “trustee.” A trustee (often a person or group of persons, but it could just as well be a service or a combination of the two) controls the release of keys and prevents “privilege escalation.”

As we bring zTrustee to market, I’m certain we’ll discover more use cases and continue to innovate on the trustee concept. If you’re interested in learning more, sign up for our free trial.

Wednesday, 20 June 2012 19:20

Introducing Gazzang zTrustee!

Written by

I'm out at the GigaOM Structure conference in sunny San Francisco this week, where Gazzang has launched its newest product -- Gazzang zTrustee! My colleagues and I have dedicated the last 6 months to the design, architecture, development and testing of this new product, and I'm thrilled to finally be able to speak freely about it.

Gazzang's original product, zNcrypt, is a transparent data encryption solution -- a GPLv2 encrypted filesystem built on top of eCryptfs, adding mandatory access controls and a dynamic policy structure. zNcrypt enables enterprise users to secure data in the cloud, meet compliance regulations, and sleep well at night, ensuring that all information is encrypted before it is written to the underlying storage.

As of today, Gazzang's newest product, zTrustee, is an opaque object storage system, ultimately providing a flexible, secure key management solution for data encryption. Any encryption system, at some point, requires access to keys, and those keys should never be stored on the same system as the encrypted data. While zTrustee was initially designed to store keys, it can actually be used to put and get opaque data objects of any type or size.

Planet Ubuntu readers might recognize a few small-scale ancestors of zTrustee in other projects that I've authored and talked about here in the past... The encrypted pbputs and pbget commands now found in the pastebinit package are similar, in principle, to zTrustee's secure put and get commands. But rather than backing uploads with a pastebin server, we have implemented a powerful, robust, enterprise-ready web service with extensive, flexible policies, redundancy, and fault-tolerance. The zEscrow utility and service are also similar in some other ways to zTrustee, except that zEscrow is intended to share keys with a backup service, while zTrustee blindly and securely stores opaque objects, releasing only to authenticated, allowed clients per policy.

Planet Ubuntu readers may be pleased to hear that our zTrustee servers are currently running Ubuntu 12.04 LTS server, replicated across multiple cloud providers. The RESTful web service is built on top of a suite of high quality open source projects, including: apache2, python wsgi, postgresql, sqlalchemy, postfix, sks, squid, gnupg, and openssl (among others).

The zTrustee client is a lightweight python utility, leveraging libcurl, openssl, and gnupg to send and receive encrypted, signed JSON blobs, to and from one or more zTrustee servers. The client utilizes the zTrustee Python library, which does the hard work, encrypting, decrypting, and processing the messages to and from the zTrustee server. You'll soon be able to interface with zTrustee using either the command line interface, or the Python library directly in your Python scripts.

We've turned our current focus onto Android, while developing a Java interface to zTrustee, so that Java programs and Android applications will soon be able to interface with zTrustee, putting and getting certificates and key material and thereby enabling mobile encryption solutions. Looking a little further out down our road map, we'll also use these Java extensions to support zTrustee clients on iOS, Mac, and Windows.

While I'm a big fan and proponent of eCryptfs and zNcrypt, I plainly recognize that there are lots of other ways to encrypt data -- dmcrypt, TrueCrypt, FileVault, BitLocker, HekaFS, among many others. From one perspective, encrypting and decrypting data is now the easy part. Where to store keys, especially in public/private/hybrid cloud environments, is the really hard part. Many people and organizations have punted on that problem. Well, as it happens, I like hard problems, and Gazzang likes market opportunities, and for that, we're both proud to promote zTrustee as a new solution in this space.

This post is intended as a very basic or brief introduction to the concept, and I'll follow this with a series of examples and tutorials as to how you might use the zTrustee client, library, and mobile interfaces.

Cheers,
:-Dustin

Monday, 18 June 2012 16:15

Don’t Forget to Secure your Backup

Written by

Backup is a very important part of data management that is way too often misunderstood or ignored altogether. At least that's been my experience for several years now. From a security standpoint a backup – especially transportable “export/import” type backups that all databases offer in some form or another – presents an easy target for data theft.  Often that theft goes unnoticed or unreported.

Just as many open source products, databases and data stores fail to offer transparent data encryption (TDE) to protect all the data in the database, so too do backups. “Unified Transparent Encryption” with Gazzang zNcrypt provides effective data encryption and key management for backup and recovery.

Last October, I wrote a blog called, Running a Secure (Encrypted) MySQL Backup Using mysqldump on Linux. The idea was to help zNcrypt users take some simple steps to protect their mysqldump jobs – securing the user/password credentials and the backup files as well.  The blog grabbed the attention of MySQL guru and Oracle ACE Director Ronald Bradford, who wrote about it in his latest book, MySQL Backup and Recovery Essentials.

Most often, a combination of backup types is needed to fully meet high availability and disaster recovery needs. Fortunately, the benefits of Unified Transparent Encryption go beyond export/import or other native database backup utilities. It’s also applicable to operating system and file-oriented methods.  With zNcrypt in place, applications read and write data in the same format as always – that’s the transparent encryption part. This is provided by a stacked virtual filesystem.  The OS users can see the underlying files, which are encrypted and secure.  Those files can safely be copied (backed up) and restored on another system. If that system has zNcrypt installed and the same key is configured, that data can once again be accessed via transparent encryption.

With this, data can be transferred safely in its encrypted form: from enterprise to cloud, cloud to cloud, or cloud to enterprise.  We’ve provided an example of this in a prior blog. A number of Gazzang customers use this method via Zmanda-based backups.

Taking this a step further, with something like DRBD or R1Soft backup products that back up and synchronize at the block level, all the blocks are encrypted, the data is protected, and as long as zNcrypt is installed with the same key on the backup and recovery servers, it all works.

Unified Transparent Encryption is very popular with Gazzang customers running big data. Value, flexibility, and ease of use are important for those big data architectures or search solutions like Solr where the same challenges exist. We’ll save those details for another blog, but certainly reach out if your need is imminent and we can share the secure data lifecycle with you as well.

Again, sensitive data is at risk anywhere it’s stored, backed up, exported, or imported.  Proper IT security involves mapping your data’s lifecycle and finding and remediating risks.  Encryption is a great security tool, but it can be hard to code and create on your own. Gazzang’s zNcrypt customers use the product day in, day out to solve these challenges with simple and elegant Unified Encryption. If you’ve got a challenge that’s got you stumped, I’d be happy to have a look. It’s amazing what Unified TDE can do.

Dear People who make me create passwords,

Hi there. You don’t really know me, although you pretend like you do. I get lots of emails from you. Most go straight to my spam folder, but some manage to hit my inbox now and then. Let me take a moment to respond to a few of those right now:

  • Yes, I enjoyed my recent stay at the Westin, thank you very much.
  • No, I do not wish to have my spider veins lasered off… yet.
  • My Klout score is still lower than Ron Paul’s polling numbers? Seriously?
  • I don’t know you. So no, I don’t want to connect on LinkedIn.

That’s right. I’m on LinkedIn, which means like millions of other career-minded individuals, I was forced to change my password on Wednesday. You may already know this about me, what with your tracking cookies and all, but I work for a cloud data security company. So it may come as a surprise that there are few things in this world I dislike more than changing my password. In fact, if I were to put together a list of things that bug me, it would probably look something like this:

  • People who text while driving
  • Parents who send their kids to daycare knowing they’re sick
  • Changing my password
  • Notre Dame football
  • Any pastry or dessert that contains coconut

Here’s my issue with passwords. It takes time and effort to create a good one that’s also easy to remember. And that password is often linked to confidential information about my family, friends, co-workers, my fantasy football team and me.  So in exchange for the use of your Web site, I provide you with this password (along with my sensitive data) with the understanding that you’ll do everything in your power to keep it safe.

And yet here we are today. Passwords from LinkedIn, Last.fm and eHarmony are being passed around on the Internet like pictures of Kate Upton. I realize that as users, we are responsible for creating unique passwords that don’t contain the word “password,” but our collective ignorance is no excuse for poor data protection.

My request is simple. All I want you to do is protect my password. I don’t think that’s too much to ask. Just encrypt it. Add some salt to your SHA-1 or SHA-2 hash. Use an RSA key. Bury the passwords in your backyard. I don’t care, just be more careful with them.
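For the people who make me create passwords and would like a starting point, here is a minimal sketch of salted password hashing using Ruby's standard openssl library. It swaps the single salted SHA pass mentioned above for PBKDF2 with HMAC-SHA256, which salts the password and also repeats the hash many times. The helper names and the iteration count are assumptions for illustration, not a recommendation tuned for any particular site.

```ruby
require 'openssl'
require 'securerandom'

PBKDF2_ITERATIONS = 100_000 # assumed work factor; tune for your hardware

# Hypothetical helper: derive a salted hash to store instead of the password.
def hash_password(password, salt = SecureRandom.random_bytes(16))
  digest = OpenSSL::PKCS5.pbkdf2_hmac(
    password, salt, PBKDF2_ITERATIONS, 32, OpenSSL::Digest.new('SHA256')
  )
  { salt: salt, hash: digest }
end

# Hypothetical helper: re-derive with the stored salt and compare the bytes
# without stopping at the first difference.
def password_matches?(password, stored)
  candidate = hash_password(password, stored[:salt])[:hash]
  return false unless candidate.bytesize == stored[:hash].bytesize

  diff = 0
  candidate.bytes.zip(stored[:hash].bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

stored = hash_password('abc123')
puts password_matches?('abc123', stored)
puts password_matches?('password', stored)
```

Only the salt and the derived hash get stored, so a leaked table does not hand over the passwords themselves.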

Data encryption is now so inexpensive, simple to implement and high performing, that it’s downright absurd when passwords get leaked.

So let’s make a deal right here and now. I’ll stop using abc123 as my online banking login, if you promise to make it more difficult for hackers to steal my password.

Also, could you throw in 30% off Jet Ski rentals for Fourth of July weekend?

Thanks,
David

Tuesday, 05 June 2012 12:15

Gazzang Top 5 - MIT and Mongo Edition

Written by

A few weeks back, I mentioned Gazzang was headed out to the MIT CIO Symposium and MongoNYC to showcase our big data security solutions. Then, like a wicked slice from Judge Smails, I just left you all hanging. No recap. No photos. Nothing.

Well, that ends here. Below are my top five thoughts from three long days in the northeast:

  1. Big data is gaining traction with CIOs - And why wouldn't it be? Panelist James Noga, vice president and CIO at Partners Healthcare, said that big data is worth $300 billion in the US health care industry alone.

  2. Screw rocket science. Data science is where the action is - While NASA’s funding is being slashed, the relatively new field of data science is all the rage. What a data scientist is exactly is still up for debate, but organizations focused on problem solving vs. number crunching are actively hiring for these positions.

  3. Schools should stop teaching calculus - I wish my 11th grade math teacher were in attendance when one of the big data panelists said, “stop teaching calculus and start teaching conditional stats and probability. The world is a messy place.”

     Since the panelist didn’t stay with us at the Hotel Pennsylvania in Manhattan, my guess is the “messy” he refers to involves the velocity, variety and volume of data coming into your business. Big data comes in all shapes and sizes and doesn’t lend itself neatly to a table, spreadsheet or relational database. The rules governing calculus simply don’t account for conditions or sentiment.

  4. It takes very little brainpower to set up a tradeshow table. However, it takes three executives with multiple degrees (and several rolls of packing tape) to dismantle and pack it.

  5. MongoNYC was absolutely the place to be - As much as we enjoyed our time at the MIT CIO Symposium, we couldn’t wait to meet up with the folks from 10gen and MongoDB in New York.

     The event was incredibly well run, and we had a great deal of traffic at the booth. This was really my first time telling the big data encryption story to Gazzang prospects, and I was fired up to hear that the message is resonating. In fact, many of my conversations started off just like this:

     Me (to attendee staring at our Prezi video): “Hi. Do you have any questions?”

     Attendee: “Yeah, what do you guys do?”

     Me: “We provide data security solutions for MongoDB.”

     Attendee: “Awesome. I want that!”

     From there, we talked about encrypting sensitive data in their Mongo environment, managing the encryption keys and how Gazzang can provide a holistic view of their IT environment, courtesy of their machine data.

     We saw a nice cross-section of verticals (healthcare, retail, financial services, SaaS), each with a unique set of data security challenges and opportunities. The sessions were very informative too.

  Bonus thought - No matter how hungry you think you are at 1:30am, do not eat the sausage, egg and cheese biscuit on the night train from Boston to Penn Station. It’s just not worth it.

The notion of predicting the future is nearly as old as time itself. In prehistoric days, seers and prophets would carve their predictions on cave walls. The example on the right, created during the Paleolithic era, is the earliest known rendering of what would ultimately become the curiously popular game known as "Angry Birds."

From Nostradamus to H.G. Wells to Conan O’Brien, visionaries perform a noble and valuable service of informing the public. Finding someone who can look at the data at their disposal and extrapolate some sort of future state is a unique and often lucrative skill.

Steve Jobs’ vision of a do-it-all handheld device turned Apple’s fortunes around. Alexander Graham Bell believed that shouting was not an effective form of long-distance communication. His foresight lifted the spirits of millions of lonely grandparents who retired to Florida.

If history tells us anything, it’s that organizations that go to great lengths to hire a futurist are destined for greatness. Unfortunately for us, most visionaries prefer being paid in cash vs. data security software licenses.

We were, however, able to afford one prophetic fellow unafraid to share his views, however bleak, on the future of big data.

Click below to view the videos... at your own peril.