With the emergence of Cloud computing and the rise of Big Data analytics, the role of information security has become more important, even as the ‘status quo’ of security-patrolled silos and systems is being opened up.

In risk terms, the problem focus has shifted from on premise managed systems to off premise environments, where the infrastructure and systems are managed by outsiders. Users access corporate systems on a variety of devices and in a variety of workplaces.

To further complicate this pattern, many organisations’ information stores are not exclusively on or off premise, but both, in a hybrid architecture.

The goal of Information Security remains the same: to protect the Confidentiality, Integrity and Availability of information for authorised users.

‘Classic’ Network Centric Information Security

Information in networked or on premise systems is normally secured using a layered approach to the interaction between systems. This provides a ‘Defence-In-Depth’ approach to the protection of information, with controls and protections implemented at each level of the model to ensure that potential attackers have to circumvent different layers of security to gain access to information.

Controls are implemented on both the sending and receiving networks to ensure secure information communication between the two.

On premise systems are usually under the management of IT operations and they communicate with business owners and managers to implement policy, procedures and monitoring of system access by implementing controls at each layer.


However, the number and frequency of breaches has increased steadily over time, to the point where it is probable that most organisations will be breached at some point. This has been brought about in part by:

  • The complexity of managing the layered approach.
  • The separate management of layers by operations, development and helpdesk teams, as well as business users.
  • The outsourcing of part or all of systems management to other organisations.
  • The opening up of organisation business processes to include trading partners or third parties, and the emergence of value creation networks that span several organisations.
  • The primacy of network availability (without which Denial of Service occurs), which means security receives insufficient consideration at application design time.

Security for Cloud and Big Data Systems

Data held in big data systems consists of a mixture of data sources of varying maturity, from cloud hosted database systems that contain curated, structured data to completely unstructured files in many formats. Big data is characterised by data that exhibits the 3 V’s (Volume, Velocity and Variety).

Data Variety

The more structured data is subject to ‘schema on write’ enabling fields to be identified and classified according to their sensitivity. This data has often been cleansed and transformed, as happens in data warehouses.

Where data is unstructured or poorly described, the schema will often be applied ‘on read’, or the data held in NoSQL databases or file table storage. Consumers of these sources will often only read the data they require when it is needed, and sensitive and non-sensitive data may sit in the same file. In certain cases this only becomes apparent when the files are accessed for analysis.
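As a minimal sketch of what ‘schema on read’ sensitivity scanning might look like, the following Python applies hypothetical detection patterns to raw records only at the point a file is read. The pattern names and regexes are illustrative assumptions, not an exhaustive detection scheme:

```python
import re

# Hypothetical patterns for sensitive values that may appear in
# otherwise unstructured records (illustrative only, not exhaustive).
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){15,16}\b"),
}

def scan_record(record: str) -> set[str]:
    """Return the sensitivity tags found in a raw, schema-less record."""
    return {tag for tag, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(record)}

def scan_file(lines):
    """Apply the schema only on read: flag each line that mixes
    sensitive values into an otherwise non-sensitive file."""
    return [(i, tags) for i, line in enumerate(lines)
            if (tags := scan_record(line))]
```

In practice a scan like this would run as part of ingestion or cataloguing, so that mixed-sensitivity files are flagged before analysts touch them.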

In some cases the data is completely unstructured and not schematised, making even the identification of sensitive information challenging. Data may be classified at the highest level as ‘CCTV images’, but the actual content is unknown until it is viewed.

Cloud technologies are also used to provide storage only, where documents can contain any type of information, sometimes with commercial or intelligence data contained within them.

Data Velocity

Big data systems may also contain real time Internet of Things (IoT) and time series data. This data is used to provide real time analytics from remote sensors and can include many types of sensitive data, including customer transactional data. It can be used either to provide real time intelligence or be aggregated into historical trend data.

Data Volume

Multi-terabyte and petabyte sized data stores are becoming increasingly common, and cloud systems are used to store this data and turn it into information and insight. As the size of the data stores and databases has grown, so has the risk and occurrence of large data breaches.

Big Data Architectures & Security

Cloud analytics systems may implement a ‘Data Lake’ approach to gathering and storing all types of data logically together before transformation and use, or else may employ a Lambda Architecture to merge immutable master data sources and streamed/time series data together.

Big Data architectures are designed with speed to insight in mind, but this may come at the expense of security and data governance. Left unchecked, this could expose the business to non-compliance, data breach and penalties from regulators.

Data analysis is carried out on models that utilise the data at its finest grain, and any information security protections must be afforded at this lowest level.

IT departments working with Big Data are adopting the ‘DevOps’ model, with combined operations and development teams configuring cloud environments: provisioning new instances of virtual machines and environments, and adding data to databases and cloud storage.

Applications and analyses are run by the business using a range of tools and applications across different business domains and contexts. Research with data takes data scientists across system and business boundaries to arrive at novel business insights.

The ability to access datasets by specific role or business user, and to carry that authorisation across different business domains and problems is also a critical component of an integrated information security approach.

Cloud hosting is an enabler for the advanced analytics of big data, but it also reduces the options available to organisations for managing information security.

The ‘full stack’ OSI model of security is not configurable when only the Session, Presentation and Application layers (and some of the Transport layer) are available to the organisation, and the cloud provider supplies managed public infrastructure and operating systems.

Information Security still represents a large and growing risk for organisations and the approach to minimising this risk in Cloud and Big Data systems must be adapted to the changed environment.

A Data Centric Security Approach     

In order to take a pro-active stance on security the organisation must assume a data breach has already happened and plan to protect the data assets accordingly.

Organisations must work within the restrictions of the cloud model to ensure that their data maintains the Confidentiality, Integrity and Availability that makes it an asset.

Information Security has moved from being a network focussed activity to being a host and application focussed activity.

  1. Review Information Security Requirements

The business must first develop a policy to lay out exactly which types of data assets and information sources are to be defended against disclosure. It should also set policies and guidance on the types of devices that can be used to access the data.

This should include a scope review of regulatory and data protection requirements and of how data is handled through the information life cycle (see my post, Organisational Privacy Policy and Information Architecture, for an example of how the OAIC in Australia recommends these actions for the protection of Personally Identifiable Information).

As well as identifying the PII data to be protected, further analysis must be made of the risk that machine processed data can be re-identified (for example, ‘anonymous’ data sets being matched to voter rolls).

Big data computing means that Information Security is moving from the domain of IT experts with an understanding of network security architectures and towards a collaborative data focused Business & IT responsibility.

The output of this stage will be a definition of the information that must be secured and a discovery of where it is currently held.

  2. Review Roles and Business Contexts

An inspection of the different role types within the organisation, and the contexts within which data are handled, processed and transformed should be carried out to ensure that the risks are acceptable to the business owners and that security is maintained across business unit boundaries. When data is changed from being a ‘siloed’ business unit responsibility to being an enterprise wide resource the stewardship of that data must still be maintained. Access to data in the cloud should be matched to the job role being undertaken and the business context where this is used.

The output of this stage will give an understanding of the roles and business contexts that require access to the information.
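The role-and-context mapping described above can be sketched as a simple entitlement matrix, where access follows both the job role and the business context it is exercised in. The roles, contexts and dataset names below are hypothetical:

```python
# Hypothetical role/context entitlement matrix: access to a dataset is
# granted only when both the job role and the business context match.
ENTITLEMENTS = {
    ("data_scientist", "marketing"): {"campaign_results", "web_analytics"},
    ("data_scientist", "credit_risk"): {"loan_book"},
    ("analyst", "marketing"): {"campaign_results"},
}

def can_access(role: str, context: str, dataset: str) -> bool:
    """Authorisation follows the role *and* the context it is used in,
    so entitlements do not silently leak across business unit boundaries."""
    return dataset in ENTITLEMENTS.get((role, context), set())
```

The point of keying on the (role, context) pair rather than the role alone is that a data scientist moving from a marketing problem to a credit risk problem does not carry their marketing entitlements with them.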

  3. Classify and Tag the Data/Information

The information sources must be classified and tagged according to their sensitivity. Where datasets contain data of mixed sensitivity this must be tagged to the highest level of any individual data item to ensure that future usage protects this information.
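The ‘tag to the highest level’ rule can be expressed directly in code. A minimal sketch, assuming a hypothetical four-level classification scheme:

```python
# Hypothetical ordered classification levels, lowest to highest.
LEVELS = ["public", "internal", "confidential", "restricted"]

def dataset_classification(column_tags: dict[str, str]) -> str:
    """A mixed-sensitivity dataset is tagged at the highest level of any
    individual data item it contains."""
    return max(column_tags.values(), key=LEVELS.index)

# Example: one confidential column lifts the whole dataset's tag.
tags = {"product": "public", "email": "confidential", "revenue": "internal"}
```

Applied to the example, the dataset as a whole would be tagged ‘confidential’ because of the single email column, which is exactly the behaviour wanted for mixed-sensitivity files.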

Cloud providers often include ways in which to tag the data for discovery, and organisations should ensure they are aware of the methods of doing this.

Rather than being thought of as a business overhead, the task of data tagging and discovery will inform and enable automation of auditing, data governance and system development. For many organisations that work in regulated environments this will be merely an extension of the existing data classification systems in use.

This stage identifies the sensitivity of data artefacts.

  4. Implement Protective Controls

Having discovered, classified and tagged the data, and analysed the role and access requirements, it should be possible to secure the information by utilising the technology options available.

This requires action by organisations at the levels of the network information model that remain under their control.

Information access should be granted only to authorised users and roles, and a procedure for the granting and revoking of rights should be implemented.
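A minimal sketch of such a grant/revoke procedure, with every change of rights recorded so it can later be audited and alerted on (the class and field names are illustrative):

```python
from datetime import datetime, timezone

class AccessRegistry:
    """Sketch of a grant/revoke procedure with a built-in audit trail."""

    def __init__(self):
        self.grants = set()      # (user, dataset) pairs currently in force
        self.audit_log = []      # (timestamp, action, user, dataset)

    def _record(self, action, user, dataset):
        self.audit_log.append(
            (datetime.now(timezone.utc), action, user, dataset))

    def grant(self, user, dataset):
        self.grants.add((user, dataset))
        self._record("grant", user, dataset)

    def revoke(self, user, dataset):
        self.grants.discard((user, dataset))
        self._record("revoke", user, dataset)

    def is_authorised(self, user, dataset):
        return (user, dataset) in self.grants
```

Recording the grant and revoke events alongside the current state is what makes the later monitoring stage possible: rights changes become auditable data in their own right.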

Defence in Depth still holds true in big data environments and administrators should ensure that multiple controls are applied to all sensitive information.

  5. Management, Monitoring and Alerts

In big data environments the usage and movement of data should be closely monitored for non-compliance, and to ensure that the logical system boundaries are not breached.

Things to monitor include:

  • Data movement to and from the data stores.
  • Modifications to sensitive data, which should be audited.
  • Changes in access rights for both users and data, which should be flagged.
  • Access lineage and modification history for data and information, to ensure unauthorised modifications are not made.
  • Alerts for unauthorised logins, weak passwords and suspicious access.
  • Cloud provider service levels and compliance.
  • Data encryption keys, key vault access and the rotation of encryption keys, which should be actively managed by the organisation.
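A simple sketch of alerting over audit events of this kind; the event shape and field names are assumptions, not any particular provider's log format:

```python
def raise_alerts(events):
    """Flag events that should trigger a security alert: failed logins
    and any change to access rights."""
    alerts = []
    for event in events:
        if event["type"] == "login" and not event.get("success", True):
            alerts.append(("unauthorised_login", event["user"]))
        if event["type"] in ("grant", "revoke"):
            alerts.append(("access_rights_change", event["user"]))
    return alerts

# Hypothetical audit events as a provider's logging service might emit.
events = [
    {"type": "login", "user": "alice", "success": True},
    {"type": "login", "user": "mallory", "success": False},
    {"type": "grant", "user": "bob"},
]
```

In a real deployment the same rules would be expressed in the SIEM or cloud monitoring tooling mentioned below, rather than hand-rolled; the sketch only shows the shape of the logic.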

Many Cloud providers give tenants access to monitoring and SIEM (Security Information & Event Management) tools to assist in this. Cloud security monitoring can often be provisioned as ‘Security as a Service’ (SECaaS), giving customers the benefit of security monitoring of the cloud with a consistent level of protection.

The purpose of the management part of the cycle is to provide a comprehensive monitored logical security perimeter that allows the users to carry out their roles whilst the data remains protected.


Information Security has never been a ‘do once and forget’ activity but is a process, and it continues to pose challenges as enterprises move towards cloud computing architectures and bring your own device (BYOD) end user computing.

Many organisations taking steps towards the cloud are implementing hybrid architecture patterns, where on premise IT systems provide curated data stores federated with cloud technology to gain lower costs and access to analytics tools.

In hybrid scenarios organisations should take a blended approach by keeping the OSI security model for systems hosted on premise, but should carry out Information security and threat modelling before any proposed migration to the cloud takes place.

Services provided by the Cloud provider should be closely monitored for intrusion or external security threats; but by taking the assumed breach approach and using the protective security measures available to them, organisations should be able to use, report on and manage big data implementations without exposing sensitive information.
