Here are the key best practices that organizations need to adopt for securing their Big Data.
1. Secure your computation code:
- Implement proper access control, code signing, and auditing to secure computation code.
- Implement a strategy to protect data in the presence of untrusted computation code.
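One way to picture the "verify before you run" idea is a digest allowlist, sketched below. This is a simpler stand-in for real code signing (which would use asymmetric signatures), and the job name, the sample mapper code, and the `APPROVED_DIGESTS` table are all hypothetical.

```python
import hashlib
import hmac


def sha256_hex(code: bytes) -> str:
    """Return the SHA-256 hex digest of a job's code."""
    return hashlib.sha256(code).hexdigest()


def is_approved(job_name: str, code: bytes, approved: dict) -> bool:
    """Return True only if the code matches the digest recorded at review time."""
    expected = approved.get(job_name)
    if expected is None:
        return False  # unknown jobs are treated as untrusted
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(sha256_hex(code), expected)


# Hypothetical job code as it looked when it was reviewed and approved.
reviewed_code = b"def mapper(record):\n    return record['region'], 1\n"
APPROVED_DIGESTS = {"count_by_region": sha256_hex(reviewed_code)}

# The same job after someone appended a line post-review.
tampered_code = reviewed_code + b"# unreviewed change\n"

print(is_approved("count_by_region", reviewed_code, APPROVED_DIGESTS))  # True
print(is_approved("count_by_region", tampered_code, APPROVED_DIGESTS))  # False
```

A scheduler could call a check like this before submitting a job to the cluster and write every rejection to the audit trail.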
2. Implement comprehensive end-point input validation/filtering:
- Validate and filter input from all data sources, internal or external.
- Evaluate the input validation and filtering strategy and algorithms of your Big Data solution. Can they scale to your data size requirements? In the worst case, you may need to develop a custom algorithm to validate data.
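As a rough illustration of validation that scales with data size, the sketch below checks records one at a time in a generator, so memory use does not grow with the input. The field names, the allowed sources, and the value range are assumptions made up for the example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

ALLOWED_SOURCES = {"web", "sensor", "partner_feed"}  # hypothetical source names


def validate(record: Dict) -> List[str]:
    """Return a list of problems; an empty list means the record is acceptable."""
    problems = []
    if record.get("source") not in ALLOWED_SOURCES:
        problems.append("unknown source")
    user_id = record.get("user_id")
    if not isinstance(user_id, str) or not user_id.isalnum() or len(user_id) > 32:
        problems.append("bad user_id")
    value = record.get("value")
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if not is_number or not 0 <= value <= 1000:
        problems.append("value out of range")
    return problems


def filter_stream(records: Iterable[Dict]) -> Iterator[Tuple[Dict, List[str]]]:
    """Lazily yield (record, problems) pairs, one record at a time."""
    for record in records:
        yield record, validate(record)


incoming = [
    {"source": "web", "user_id": "u123", "value": 42},
    {"source": "unknown", "user_id": "u124", "value": 7},
    {"source": "sensor", "user_id": "u1; DROP TABLE", "value": 5000},
]
accepted = [r for r, problems in filter_stream(incoming) if not problems]
rejected = [(r["user_id"], problems) for r, problems in filter_stream(incoming) if problems]
print(len(accepted), rejected)
```

Because each record is checked independently, the same `validate` function could be applied inside a distributed map step instead of a local loop.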
3. Implement granular access control:
- Access control should be implemented at various layers of Big Data Architecture.
- Access control should define the roles and privileges of both external and internal users. Define more granular roles and privileges for internal Big Data users, such as administrators, developers, and knowledge managers.
- Review who is permitted to execute ad hoc queries, whether end users or internal users.
- Review the access control features of your Big Data solution. If necessary, implement access control in your application middleware.
- When using NoSQL databases, enable security explicitly. In many NoSQL databases, security features such as authentication, authorization, and encryption are disabled by default.
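Where the data store itself lacks fine-grained controls, a role check in application middleware might look like the following sketch. The role names echo the ones listed above, but the privilege names, the `requires` decorator, and `run_adhoc_query` are hypothetical.

```python
from functools import wraps

# Hypothetical mapping of roles to privileges.
ROLE_PRIVILEGES = {
    "administrator": {"read", "write", "adhoc_query", "manage_users"},
    "developer": {"read", "write", "adhoc_query"},
    "knowledge_manager": {"read"},
    "external_user": set(),
}


class AccessDenied(Exception):
    """Raised when a user lacks the privilege an operation requires."""


def requires(privilege):
    """Decorator that rejects the call unless the user's role grants the privilege."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            granted = ROLE_PRIVILEGES.get(user.get("role"), set())
            if privilege not in granted:
                raise AccessDenied(f"{user.get('name')} lacks '{privilege}'")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@requires("adhoc_query")
def run_adhoc_query(user, query):
    # Placeholder: real middleware would forward the query to the data store.
    return f"running for {user['name']}: {query}"


dev = {"name": "dana", "role": "developer"}
km = {"name": "kim", "role": "knowledge_manager"}

print(run_adhoc_query(dev, "SELECT count(*) FROM events"))
try:
    run_adhoc_query(km, "SELECT * FROM customers")
except AccessDenied as err:
    print("denied:", err)
```

Unknown roles fall through to an empty privilege set, so the default is to deny.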
4. Secure your data storage and communication:
- Segregate sensitive data.
- Enable data encryption for sensitive data.
- Restrict and monitor administrative access to data.
- Enable auditing and logging.
- API security: Protect API access to Big Data solutions from unauthorized use by implementing proper access control.
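The segregation and auditing points can be sketched together, as below. Sensitive fields are split off into a separate store at ingest time, and every access is written to an audit log. The in-memory stores, the field classification, and the function names are all hypothetical, and encryption of the sensitive store is deliberately left out because it would need a vetted cryptography library.

```python
import json
import secrets
from datetime import datetime, timezone

SENSITIVE_FIELDS = {"ssn", "email"}  # hypothetical classification

sensitive_store = {}  # stands in for a separate, access-restricted store
general_store = []    # stands in for the main analytics store
audit_log = []        # stands in for an append-only audit trail


def audit(actor, action, ref):
    """Append one JSON audit entry with a UTC timestamp."""
    audit_log.append(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "ref": ref,
    }))


def ingest(actor, record):
    """Split a record so sensitive fields never reach the general store."""
    ref = secrets.token_hex(8)  # random link between the two halves
    sensitive_store[ref] = {k: v for k, v in record.items() if k in SENSITIVE_FIELDS}
    public_part = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    public_part["ref"] = ref
    general_store.append(public_part)
    audit(actor, "ingest", ref)
    return ref


def read_sensitive(actor, ref):
    """Return the sensitive half of a record and log who asked for it."""
    audit(actor, "read_sensitive", ref)
    return sensitive_store[ref]


ref = ingest("etl_service", {"name": "Ann", "ssn": "000-00-0000", "region": "north"})
details = read_sensitive("admin_alice", ref)
print(general_store[0], len(audit_log))
```

Analysts who query only the general store never see the sensitive fields, and the audit log records which actor followed the reference back.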
5. Implement Privacy Preserving Data Mining and Analytics:
- Before sharing Big Data analytics results, verify that they do not unintentionally disclose sensitive data.
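One simple disclosure check before sharing aggregated results is small-count suppression, sketched below: groups with fewer than `k` members are dropped from the output, since they can point to identifiable individuals. This is only one basic control and not a complete privacy-preserving technique. The threshold `k=5` and the sample data are assumptions.

```python
from collections import Counter


def safe_counts(records, group_key, k=5):
    """Count records per group, releasing only groups with at least k members."""
    counts = Counter(r[group_key] for r in records)
    return {group: n for group, n in counts.items() if n >= k}


# Hypothetical patient records: six in one region, two in another.
patients = [{"region": "north"}] * 6 + [{"region": "south"}] * 2

print(safe_counts(patients, "region"))  # {'north': 6}
```

The "south" group is withheld because a count of two could be traced back to specific people.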
Security risks in Big Data are twofold: traditional security risks, and risks specific to Big Data architecture and requirements. Hadoop is a long-established Big Data solution. In the past, it had virtually no security, although many security features have since been developed for it. The same is true of NoSQL databases. The key point to keep in mind is that security features in these solutions are typically disabled by default. It is therefore important for Big Data early adopters and innovators to understand the security risks of implementing Big Data in their enterprises.