A GDPR Compliance Journey

The technology challenges, approaches, and lessons learned for the centralized testing environment

Lessons from the GDPR Compliance Journey of a Leading Financial Services Organization

In preparation for General Data Protection Regulation (GDPR) compliance, a global 100 financial services organization embarked on a journey to assess its core information processing environments with the objective of identifying opportunities to strengthen its data privacy protection programs. This article focuses on the technology challenges, approach, and lessons learned for the centralized testing environment.

Situation
Like many DevOps groups across the industry, this financial organization has adopted both continuous testing and quality testing regimes to deliver quality products using agile methodology. The organization prefers to prepare its test data from production data. While the majority of testing is done by an internal team, certain applications are tested by outsourced offshore teams. The test environment is fairly complex, comprising Oracle, Hadoop (Parquet files), Hive, Cassandra, MS SQL, SAS, and Linux-based systems. Incremental data volume ranges from 10 million to 15 million records per week. Certain major releases of Big Data-based applications require up to 5 GB of data (~75 million records).

Challenges
In order to comply with the GDPR and prevent data privacy breaches, the testing team needed to detect and de-identify PII elements. If the team uses the available de-identification methods, which rely on product-specific encryption technology such as MS SQL encryption, much of the data becomes unusable for testing for the following reasons:

  • Current methods scramble the data and make it unusable.
  • Current methods do not preserve any referential relationship between various data sources.

Masking the data presents similar challenges. For example, to test an application that calculates the end-of-month summary balance of a customer account from an Oracle data source and a Hadoop data source, the team would not be able to use data encrypted with the available technology, because the protected values would no longer match across the two sources.
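
To make the referential integrity problem concrete, here is a minimal Python sketch of one possible remedy: deterministic, keyed pseudonymization. It is not the organization's solution and it is not true format-preserving encryption (it is one-way, and distinct inputs could in principle collide). The key, the function names, and the sample rows are all hypothetical. Because the same account identifier always maps to the same masked value, the join between the two sources still works after masking.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real system would use a key manager.
SECRET_KEY = b"demo-key-not-for-production"


def pseudonymize_digits(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace each ASCII digit with a keyed pseudo-random digit.

    Non-digit characters (such as dashes) are kept, so the layout survives.
    The mapping is deterministic for a given key, which is what keeps
    join keys aligned across data sources. It is one-way, not reversible.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    used = 0
    for ch in value:
        if ch in "0123456789":
            out.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            out.append(ch)
    return "".join(out)


def mask_rows(rows, column):
    """Return copies of the rows with one column pseudonymized."""
    return [{**row, column: pseudonymize_digits(row[column])} for row in rows]


def month_end_balances(accounts, transactions):
    """Join transactions to accounts on account_id and sum the amounts."""
    balances = {a["account_id"]: a["opening_cents"] for a in accounts}
    for t in transactions:
        balances[t["account_id"]] += t["amount_cents"]
    return balances


# Hypothetical sample data standing in for the Oracle and Hadoop sources.
oracle_accounts = [
    {"account_id": "4001-2234-01", "opening_cents": 100000},
    {"account_id": "4001-9876-02", "opening_cents": 250000},
]
hadoop_transactions = [
    {"account_id": "4001-2234-01", "amount_cents": -25000},
    {"account_id": "4001-2234-01", "amount_cents": 7550},
    {"account_id": "4001-9876-02", "amount_cents": -10000},
]

masked_balances = month_end_balances(
    mask_rows(oracle_accounts, "account_id"),
    mask_rows(hadoop_transactions, "account_id"),
)
```

The masked balances carry the same amounts as the unmasked ones, only under different account identifiers, so the end-of-month calculation can be tested without exposing the real identifiers.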

In addition, PII often appears within comment and description fields. Encrypting or masking the entire field would result in the loss of important information.
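
One way to avoid losing the whole field is to replace only the spans that look like PII and leave the rest of the text intact. The sketch below illustrates the idea in Python. The two patterns (an email address and a US-style Social Security number) and the function name are assumptions chosen for the example; a real detector would need a far broader, validated pattern library.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_free_text(text: str) -> str:
    """Replace each PII match with a label, keeping the surrounding text."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


comment = "Customer jane.doe@example.com called about SSN 123-45-6789; refund approved."
redacted = redact_free_text(comment)
```

Here `redacted` still records that a customer called and that a refund was approved, which is the kind of information a tester may need, while the email address and the number are gone.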

More importantly, data encryption using the available methods is computationally expensive and requires a large hardware infrastructure.

Approach
The organization identified the following solution criteria to mitigate the challenges identified during the assessment:

  • Autonomous Detection: Leveraging a centralized library, the solution should examine all incoming data, including embedded documents, for the presence of PII elements. The solution should also use machine learning techniques to classify sensitive documents present in a Big Data repository.
  • Format Preserving Encryption: Based on the type of PII data and the preference of the user, the solution should encrypt data elements in the following three modes:
    • Blind mode: It should encrypt a data element if the data element matches a specific regular expression.
    • Column mode: It should encrypt the content of a specific column or field.
    • Mixed mode: It should encrypt the data elements within a specific column if the data element matches a specific regular expression.
  • Cross Platform Referential Integrity: The solution must be able to retain referential integrity between records across platforms.
  • Big Data Volume: The solution should be able to detect and encrypt sensitive data in 100 GB of data in less than one hour using commodity hardware.
  • Data Usage Monitoring: The solution should be able to record and retain information on all use of private data for audit and compliance. In addition, the solution should be able to identify abnormal data usage leveraging machine learning.
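
The three modes listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the evaluated product: `mask` is a one-way, keyed stand-in that keeps the shape of the value (digits stay digits, letters stay letters), whereas true format-preserving encryption would be reversible with the key. The key, the pattern, the function names, and the sample record are hypothetical, and every field is assumed to be a string.

```python
import hashlib
import hmac
import re
import string

# Hypothetical key and pattern for illustration only.
KEY = b"demo-key-not-for-production"
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask(value: str, key: bytes = KEY) -> str:
    """Shape-preserving, deterministic, one-way stand-in for FPE."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch in string.digits:
            out.append(string.digits[b % 10])
        elif ch in string.ascii_lowercase:
            out.append(string.ascii_lowercase[b % 26])
        elif ch in string.ascii_uppercase:
            out.append(string.ascii_uppercase[b % 26])
        else:
            out.append(ch)  # keep punctuation and spaces as-is
    return "".join(out)


def blind_mode(record: dict, pattern: re.Pattern) -> dict:
    """Mask every regex match, in every field of the record."""
    return {k: pattern.sub(lambda m: mask(m.group()), v) for k, v in record.items()}


def column_mode(record: dict, column: str) -> dict:
    """Mask the entire content of one named column."""
    return {**record, column: mask(record[column])}


def mixed_mode(record: dict, column: str, pattern: re.Pattern) -> dict:
    """Mask regex matches, but only inside one named column."""
    masked = pattern.sub(lambda m: mask(m.group()), record[column])
    return {**record, column: masked}


record = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "note": "Verified SSN 123-45-6789 by phone",
}

blind = blind_mode(record, SSN_RE)
by_column = column_mode(record, "name")
mixed = mixed_mode(record, "note", SSN_RE)
```

Because `mask` is deterministic, the number masked in the `ssn` field in blind mode is the same value that appears inside the masked `note` field, so the two fields remain consistent with each other.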

Lessons Learned

  1. Understand business and technology landscape: It is imperative to understand the current technology landscape, business practices, and emerging trends. If your technology platform and domain are monolithic today, do you expect them to remain monolithic in the near future? What would be the impact should you move some of your testing to a cloud platform? What about Big Data applications?
  2. Evaluate risks: Assess data security risks through the lens of GDPR and beyond. In addition to PII and PHI, most organizations deal with sensitive data that may not be associated with an individual. How do you detect, encrypt, and monitor other types of sensitive data, such as B2B contract information, in your testing environment?
  3. Beyond Retrofitting: Define the ideal solution characteristics prior to evaluating solutions. Retrofitting a solution to meet your business needs is often time-consuming and costly.

More Stories By Angsuman Dutta

Angsuman Dutta is the CEO and founder of Pricchaa, a Big Data security company. He is a Data Management and Analytics expert. He has helped numerous Fortune 500 enterprises with Big Data Adoption solutions primarily in Healthcare and Banking. Angsuman earned a degree in engineering from the IIT, and an MBA from the University of Chicago.
