Full Code of FallibleInc/security-guide-for-developers for AI

Repository: FallibleInc/security-guide-for-developers
Branch: master
Commit: 27db5c7c04da
Files: 9
Total size: 53.7 KB

Directory structure:
gitextract_zywe575m/

├── .github/
│   └── FUNDING.yml
├── README-zh.md
├── README.md
├── https.md
├── security-checklist-zh.md
├── security-checklist.md
├── vulnerabilities-stats-zh.md
├── vulnerabilities-stats.md
└── what-can-go-wrong.md

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/FUNDING.yml
================================================
# These are supported funding model platforms

github: FallibleInc


================================================
FILE: README-zh.md
================================================
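# A practical security guide for web developers

### The intended audience

Security issues mainly come from two kinds of developers:

1. Developers who have just started and cannot tell what MD5 and bcrypt are each for.
2. Developers who know this but forget or ignore it.

Our detailed explanations should help the first kind, and we hope our checklist helps the second kind build more secure systems. This is not a comprehensive guide; it only covers the most common issues we have discovered in the past.



### Contents

1. [The Security Checklist](security-checklist-zh.md)  
2. What can go wrong?  
3. Securely transporting data: HTTPS explained  
4. Authentication: Who am I?  
4.1 Form based authentication  
4.2 Basic authentication  
4.3 One is not enough: 2 factor, 3 factor, ....  
4.4 Why use insecure text messages? Introducing HOTP & TOTP  
4.5 Handling password resets  
5. Authorization: What am I allowed to do?  
5.1 Token based authorization  
5.2 OAuth & OAuth2  
5.3 JWT (JSON Web Token)  
6. Data validation and sanitization: Never trust user input  
6.1 Validating and sanitizing user input  
6.2 Sanitizing output  
6.3 Cross site scripting (XSS)  
6.4 Injection attacks  
6.5 User uploads  
6.6 User tampering with inputs  
7. Plaintext != Encoding != Encryption != Hashing  
7.1 Common encoding schemes  
7.2 Encryption  
7.3 Hashing & one way functions  
7.4 Hashing speeds cheatsheet  
8. Passwords: dadada, 123456, cute@123  
8.1 Password policies  
8.2 Storing passwords  
8.3 Life without passwords  
9. Public key cryptography
10. Sessions: Remember me, please  
10.1 Where to save state?  
10.2 Invalidating sessions  
10.3 Cookie monster & you  
11. Fixing security, one header at a time  
11.1 Secure web headers  
11.2 Data integrity check for 3rd party code  
11.3 Certificate pinning  
12. Configuration mistakes  
12.1 Provisioning in cloud: Ports, Shodan & AWS  
12.2 Honey, you left the debug mode on  
12.3 Logging (or not logging)  
12.4 Monitoring  
12.5 Principle of least privilege  
12.6 Rate limiting & Captchas  
12.7 Storing project secrets and passwords in a file  
12.8 DNS: Of subdomains and forgotten pet projects  
12.9 Patching & updates  
13. Attacks: When the bad guys arrive  
13.1 Clickjacking  
13.2 Cross site request forgery  
13.3 Denial of service  
13.4 Server side request forgery  
14. [Stats about vulnerabilities discovered in Internet companies](vulnerabilities-stats-zh.md)  
15. On reinventing the wheel, and making it square  
15.1 Security libraries and packages for Python  
15.2 Security libraries and packages for NodeJS  
15.3 Learning resources  
16. Maintaining good security hygiene  
17. Security vs usability  
18. Back to item 1: The Security Checklist explained  




### Who are we?

We are full stack developers who are tired of seeing developers hack something together and write a pile of insecure code along the way. In the past six months, we have prevented the leak of more than 15 million credit card details and the theft of personal details of over 45 million users, potentially saving many companies from shutting down. Recently, we discovered a security issue that could have caused a bitcoin exchange to shut down because of a data leak. We have helped several startups make their systems more secure, mostly for free, sometimes without even getting a "thank you" :)

*If you disagree with us or find a bug, please open an issue or send us a PR. Alternatively, you can reach us at hello@fallible.co.*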


================================================
FILE: README.md
================================================
# A practical security guide for web developers (Work in progress)

### The intended audience

Security issues happen for two reasons:

1. Developers who have just started and cannot really tell the difference between using MD5 and bcrypt.
2. Developers who know this stuff but forget or ignore it.

Our detailed explanations should help the first type, while we hope our checklist helps the second type create more secure systems. This is by no means a comprehensive guide; it just covers the most common issues we have discovered in the past.


### Contents

1. [The Security Checklist](security-checklist.md)
2. [What can go wrong?](what-can-go-wrong.md)    
3. [Securely transporting stuff: HTTPS explained](https.md)
4. Authentication: I am who I say I am  
4.1 Form based authentication  
4.2 Basic authentication  
4.3 One is not enough, 2 factor, 3 factor, ....   
4.4 Why use insecure text messages? Introducing HOTP & TOTP   
4.5 Handling password resets
5. Authorization: What am I allowed to do?  
5.1 Token based Authorization  
5.2 OAuth & OAuth2  
5.3 JWT
6. Data Validation and Sanitation: Never trust user input  
6.1 Validating and Sanitizing Inputs  
6.2 Sanitizing Outputs  
6.3 Cross Site Scripting  
6.4 Injection Attacks  
6.5 User uploads  
6.6 Tamper-proof user inputs
7. Plaintext != Encoding != Encryption != Hashing  
7.1 Common encoding schemes  
7.2 Encryption  
7.3 Hashing & One way functions  
7.4 Hashing speeds cheatsheet
8. Passwords: dadada, 123456 and cute@123  
8.1 Password policies  
8.2 Storing passwords  
8.3 Life without passwords
9. Public Key Cryptography
10. Sessions: Remember me, please  
10.1 Where to save state?  
10.2 Invalidating sessions  
10.3 Cookie monster & you
11. Fixing security, one header at a time  
11.1 Secure web headers  
11.2 Data integrity check for 3rd party code  
11.3 Certificate Pinning
12. Configuration mistakes    
12.1 Provisioning in cloud: Ports, Shodan & AWS  
12.2 Honey, you left the debug mode on  
12.3 Logging (or not logging)  
12.4 Monitoring  
12.5 Principle of least privilege  
12.6 Rate limiting & Captchas  
12.7 Storing project secrets and passwords in a file    
12.8 DNS: Of subdomains and forgotten pet-projects  
12.9 Patching & Updates  
13. Attacks: When the bad guys arrive  
13.1 Clickjacking  
13.2 Cross Site Request Forgery  
13.3 Denial of Service  
13.4 Server Side Request Forgery
14. [Stats about vulnerabilities discovered in Internet Companies](vulnerabilities-stats.md)   
15. On reinventing the wheel, and making it square  
15.1 Security libraries and packages for Python  
15.2 Security libraries and packages for Node/JS  
15.3 Learning resources
16. Maintaining a good security hygiene
17. Security Vs Usability
18. Back to Square 1: The Security Checklist explained




### Who are we?

We are full stack developers who grew tired of watching developers lower the bar for what counts as a hack by writing insecure code. In the past six months, we have prevented leaks of more than 15 million credit card details and personal details of over 45 million users, and potentially saved companies from shutting down. Recently, we discovered an issue that could have resulted in a system takeover and data leak at a bitcoin institution. We have helped several startups secure their systems, most of them for free, sometimes without even getting a thank you in response :)


*If you disagree with something or find a bug, please open an issue or file a PR. Alternatively, you can reach us at hello@fallible.co*


================================================
FILE: https.md
================================================
# Securely transporting stuff: HTTPS explained


## The problem
HTTP is the protocol that browsers use to communicate with servers. The problem with HTTP without the S is that it sends and receives data in plain text.

#### Well, who can see my data in plain text?

Anyone on your local network: your co-workers, for example, or the people sitting around you in your favourite cafe.

#### How will they do it?

Since the data is in plain text, they can get the [switch](https://en.wikipedia.org/wiki/Network_switch) to deliver your packets to their machine instead of yours by [ARP poisoning](https://en.wikipedia.org/wiki/ARP_spoofing) the ARP tables of the devices on the network:
![ARP poisoning](/images/arp.png)

Also, the owner of the cafe or your boss at the office can easily see your data by reprogramming the hub/switch, since they own it and have physical access to it, or by [wiretapping](https://en.wikipedia.org/wiki/Fiber_tapping) the line coming into the cafe.

**Bad HTTP!**


## Enter HTTPS

![https](/images/https.gif) 

The 'S' in HTTPS stands for Secure: if the URI of the website you are visiting starts with `https`, the connection is most likely secure, and no one in the `middle` can sniff your traffic.

### How does it work?
HTTPS encrypts all the data transferred between the browser and the server. The server and the browser use a symmetric key known to both of them to encrypt the data. The process by which they arrive at this shared key is called the [TLS handshake](https://en.wikipedia.org/wiki/Transport_Layer_Security#TLS_handshake). In simple terms, the server sends its `public key` along with its `domain name`, embedded in a `certificate`, to the browser; the browser sends back a `pre-master secret key` encrypted using the server's public key. The server decrypts that message using its private key to obtain the pre-master secret key. Both the browser and the server then convert the pre-master key into the `master secret key`, which is used to encrypt all future communication between them.

![Encryption](/images/encryption.png)

There is still one problem with the above process: any [man in the middle](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) can also generate a certificate, pretend to be the origin server and send malicious content to the browser.

To solve that problem, browsers like Chrome, Firefox and Safari come embedded with the information needed to tell which certificates are genuine. Browsers look for a signature on the certificate, and that signature needs to come from one of the trusted [certificate authorities](https://en.wikipedia.org/wiki/Certificate_authority). In simple terms, certificate authorities are well-known organisations that everyone agrees to trust (it all boils down to trust). If the certificate carries no such signature, the browser displays a warning to the user that this connection is not really HTTPS. The server's owner, in turn, needs to get a signed certificate from one of the certificate authorities by physically verifying their identity (by sending documents etc.).
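As a rough illustration of the certificate check described above, here is a minimal Python sketch (Python is only our choice for illustration; the guide does not prescribe a client). It relies on the standard library's `ssl.create_default_context()`, which loads the system's trusted CA certificates and rejects certificates that are not signed by one of them or that do not match the requested host name. The helper names `make_verifying_context` and `peer_certificate` are hypothetical.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Build a client-side TLS context that trusts only the system's CAs."""
    # create_default_context() turns on certificate and host name checks.
    return ssl.create_default_context()


def peer_certificate(host: str, port: int = 443) -> dict:
    """Connect to host, complete the TLS handshake and return its certificate.

    Raises ssl.SSLCertVerificationError if the certificate is not signed by a
    trusted CA or does not match the host name.
    """
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


context = make_verifying_context()
# Both checks are enabled by default on this kind of context.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

Calling `peer_certificate("example.org")` needs network access; against a server presenting a self-made certificate it should fail during the handshake rather than return data.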

So, `https` serves two main purposes:

* It tells you that the website domain shown in the browser is the one you are actually talking to.
* It encrypts all the communication between the server for that domain and your browser.
	
### How to get HTTPS for my website?
#### There are two ways to get HTTPS for your website
1. Paid
	* You need to buy an SSL certificate from a CA.
	* Then you need to generate a certificate signing request from your server.
	* Then they ask you to verify that you really own the domain.
	* Then they let you download the signed certificate, which you can use in your server's configuration.
2. Free:
	* Use [Let's Encrypt](https://letsencrypt.org/). Let's Encrypt is free because the whole process is automated, which gets rid of the manual cost of configuration, creation, validation, expiration etc.
	* To set it up, follow the steps mentioned here depending on your server: [Setup steps](https://certbot.eff.org/#ubuntuxenial-nginx)
	

#### Best practices for HTTPS configuration
The examples are for [nginx](https://www.nginx.com/), but settings for Apache and others are available too ([ssl config generator](https://mozilla.github.io/server-side-tls/ssl-config-generator/)).

- [ ] Regularly update/patch [openssl](https://www.openssl.org/source/) to the latest version available, because that will protect you from bugs like [heartbleed](https://en.wikipedia.org/wiki/Heartbleed) and [many more](https://www.openssl.org/news/secadv/20160503.txt).
- [ ] Add these directives to the nginx server conf for server-side protection from [BEAST attacks](https://en.wikipedia.org/wiki/Transport_Layer_Security#BEAST_attack):
       ```
	ssl_prefer_server_ciphers on;

	ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:HIGH:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4"; #Disables all weak ciphers
       ```

- [ ] Older versions of the SSL protocols have been found to have multiple severe vulnerabilities (e.g. the [POODLE attack](https://en.wikipedia.org/wiki/POODLE) and the [DROWN attack](https://en.wikipedia.org/wiki/DROWN_attack)), so support only TLSv1.1 and TLSv1.2. Do not support SSLv2 and SSLv3. Do [check the adoption](https://en.wikipedia.org/wiki/Transport_Layer_Security#Web_browsers) to understand the trade-off of restricting to these versions of TLS.
       ```
	ssl_protocols TLSv1.1 TLSv1.2;
	```

- [ ] The default Diffie-Hellman parameter used by nginx is only 1024 bits, which is not considered secure, and it is the same for every nginx user who keeps the default config. It is estimated that an academic team can break a 768-bit prime and that a nation-state could break a 1024-bit prime; breaking a single 1024-bit prime would allow eavesdropping on 18 percent of the top one million HTTPS domains. So do not use the default DH parameter: generate your own locally, with a higher number of bits.
	```shell
	$ cd /etc/ssl/certs
	$ openssl dhparam -out dhparam.pem 4096
	```
       
       ```
	ssl_dhparam /etc/ssl/certs/dhparam.pem;
       ```
       
- [ ] Enable HSTS (HTTP Strict Transport Security) to avoid [SSL stripping](https://en.wikipedia.org/wiki/SSL_stripping#SSL_stripping). This should not cause problems as long as ALL, yes, ALL traffic is redirected to https.
       ```
	add_header Strict-Transport-Security "max-age=31536000; includeSubdomains;";
       ```
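For readers who terminate TLS in application code rather than in nginx, the protocol-version and HSTS settings in the list above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not an nginx equivalent: `make_server_context` and `hsts_middleware` are hypothetical names, the sketch sets TLS 1.2 as the minimum version (stricter than the `TLSv1.1 TLSv1.2` line above), and a real server would still need `load_cert_chain()` with its own certificate and key.

```python
import ssl

# Same policy as the nginx add_header line: one year, subdomains included.
HSTS_VALUE = "max-age=31536000; includeSubDomains"


def make_server_context() -> ssl.SSLContext:
    """Server-side TLS context that refuses anything older than TLS 1.2."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # A real deployment would also call context.load_cert_chain(cert, key).
    return context


def hsts_middleware(app):
    """Wrap a WSGI app so every response carries the HSTS header."""

    def wrapped(environ, start_response):
        def start_with_hsts(status, headers, exc_info=None):
            # Drop any existing HSTS header so the policy is set exactly once.
            kept = [(name, value) for name, value in headers
                    if name.lower() != "strict-transport-security"]
            kept.append(("Strict-Transport-Security", HSTS_VALUE))
            return start_response(status, kept, exc_info)

        return app(environ, start_with_hsts)

    return wrapped
```

Wrapping happens once at startup, for example `application = hsts_middleware(application)`, so individual request handlers cannot forget the header.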

## Certificate Pinning for apps (and websites)
#### What's this now?
In general, any user who has access to the app can see all its API calls, even if they go over HTTPS. To do that, the user creates a certificate authority and tells the device (Android / iOS) to trust it. When the app then connects to your server, the user's proxy sits between the two and replaces your server's certificate with one generated `on the fly` (with its own public/private `key` pair) and signed by that certificate authority. The proxy can now sit in the middle, acting as the server for the mobile client and as the client for the server. Sneaky.

#### Wait! Isn't HTTPS supposed to prevent that?
Yes, but HTTPS can only help you when the trusted certificate authorities are actually trustworthy. In this case, the user forced the device to trust a certificate authority they created themselves!

#### So, how do I prevent that?
Certificate pinning: in your app bundle, hard code the server certificate, and before making any API call check whether the server is really using that same hardcoded certificate or whether someone has tried to sneak in their own.
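The check described above can be sketched in Python as follows. This is a minimal illustration, assuming the pin is stored as the SHA-256 fingerprint of the server's DER-encoded certificate; the function names and that storage format are our own choices, and mobile apps would use their platform's TLS APIs instead.

```python
import hashlib
import hmac
import socket
import ssl


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_hex: str) -> bool:
    """True only if the presented certificate is the one bundled with the app."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(cert_fingerprint(der_cert), pinned_hex.lower())


def connect_with_pin(host: str, pinned_hex: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TLS connection and refuse to continue if the pin does not match."""
    context = ssl.create_default_context()
    sock = socket.create_connection((host, port), timeout=10)
    tls = context.wrap_socket(sock, server_hostname=host)
    # binary_form=True returns the raw DER bytes of the peer certificate.
    if not matches_pin(tls.getpeercert(binary_form=True), pinned_hex):
        tls.close()
        raise ssl.SSLError("server certificate does not match the pinned one")
    return tls
```

Because the CA check from `create_default_context()` still runs first, a proxy with a user-installed CA passes that step but fails the pin comparison.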

#### Caution
* In case the certificate changes on the server side you will have to force the users to update the app else the app will stop working.
* If you mess up the certificate pinning, you will have to ask users to update the app else the app will stop working.

#### A better way!
Certificate pinning is a good way to prevent this, but there is a better way to ensure no one can snoop in: `public key pinning`. Sites like Google rotate their certificates regularly, so with certificate pinning you would have to force users to update your app every time. Instead, what you should pin in your app is the `public key`, which remains static even when Google rotates its certificate, so no app update is needed. This is called `Public key Pinning`.

* [Android and iOS sample code examples](https://www.paypal-engineering.com/2015/10/14/key-pinning-in-mobile-applications/)

## Precautions for general public
* When you visit a website in your browser, make sure it displays the padlock like this ![padlock](/img/padlock.png) (it will be gray in Safari).
* If you are using an untrusted or public network (wifi/wired) and you see a struck-out padlock and a warning page, do not proceed; someone might be snooping on your traffic.
* On iOS and Android, you have no way to tell whether an app is encrypting its traffic. Bad luck.
* Do not hand over your unlocked mobile phone to any untrusted person. They might install untrusted `CAs` (certificate authorities) and then see all your traffic.
* If you use a mobile phone or laptop provided by your company, they might have installed certain `CAs` (certificate authorities) for the device to trust and can easily snoop on all your browsing. You should check whether any `CA` is installed on your phone. Steps to check: on iOS, go to `Settings` -> `General` -> `Profiles`. If there is anything installed there, someone might be sniffing your traffic. On Android, go to `Settings`, under "Personal," tap `Security`, then under "Credential storage," tap `Trusted credentials`. Check the certificates installed by the user and by the system.


## Future of HTTPS
The web was built on the HTTP protocol, which lacks the security bit. Slowly people started to feel the need for a secured channel, and that led to the birth of HTTPS. Still, as of today the majority of websites use HTTP, since that is the `default protocol`. Anyone who needs HTTPS uses one of the methods mentioned in the section above, "How to get HTTPS for my website?".

It would be awesome if all websites used `https` instead of `http`. Browsers should also force https, meaning they should fail any request that is not `https`. Currently this is implemented using the `HSTS` preload list, but opting in is optional for websites; it would be nice if all websites were forced to use https. This would improve the security of end users. There are a lot of people promoting the move to https everywhere.

But there is a problem with upgrading to https: if a website was previously linked as http and now only works with https, that `http link` will break (as the linking website will not update its links). There are plugins such as [HTTPS everywhere](https://www.eff.org/Https-everywhere) which force all communication onto `https://` where possible. But a better [proposal](https://www.w3.org/DesignIssues/Security-NotTheS.html) is to do HTTPS everywhere in the sense of the protocol but not the URI prefix: we do not need two different prefixes `http` and `https`, we should just make http use TLS fundamentally.


================================================
FILE: security-checklist-zh.md
================================================
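[Back to Contents](README-zh.md)


### The Security Checklist

##### Authentication systems (Signup/Signin/2 Factor/Password reset)
- [ ] Use HTTPS everywhere.
- [ ] Store password hashes using `Bcrypt` (no salt necessary - `Bcrypt` does it for you).
- [ ] Destroy the session ID after `logout`.
- [ ] Destroy all active sessions after a password reset.
- [ ] OAuth2 authentication must include the `state` parameter.
- [ ] Do not redirect straight to an open path after a successful login (validate it, otherwise it is open to phishing attacks).
- [ ] When parsing signup/login input, filter javascript://, data:// and CRLF characters.
- [ ] Use secure/httpOnly cookies.
- [ ] In mobile `OTP` verification, do not return the OTP (One Time Password) directly when the `generate OTP` or `Resend OTP` API is called. (Deliver it through an SMS verification message, a random code sent by email and so on, rather than in the response.)
- [ ] Limit the number of calls a single user can make to the `Login`, `Verify OTP`, `Resend OTP` and `generate OTP` APIs, and use measures such as a captcha to prevent brute forcing.
- [ ] Check the reset password token in the email or SMS to make sure it is random (cannot be guessed).
- [ ] Set an expiration time on the reset password token.
- [ ] Invalidate the reset token once the password has been reset successfully.


##### User data and authorization
- [ ] Any access to resources such as `my cart` or `my history` must check that the currently logged-in user has permission to access them.
- [ ] Avoid serially iterable resource IDs. Use `/me/orders` instead of `/user/37153/orders`, so that data does not leak if you forget a permission check.
- [ ] The `edit email/phone number` feature must first confirm that the user has verified that the email/phone is their own.
- [ ] Any upload feature should sanitize the filename uploaded by the user. Also, for general reasons (rather than security ones), uploads should be stored in cloud storage such as S3 (and processed using lambda) rather than on your own server, to prevent code execution.
- [ ] The `profile photo upload` feature should strip all `EXIF` tags, even if this is not a stated requirement.
- [ ] For user IDs and other IDs, use [RFC compliant](http://www.ietf.org/rfc/rfc4122.txt) `UUID`s instead of integers. You can find an implementation for your language on GitHub.
- [ ] [JWT (JSON Web Token)](https://jwt.io/) is great. Use it when you need to build a single page app/API.


##### Android and iOS app
- [ ] The `salt` from payment gateways should not be hardcoded.
- [ ] `secret` and `auth token` values from third parties should not be hardcoded.
- [ ] APIs meant to be called server to server should not be called from the app.
- [ ] On Android, carefully evaluate all requested [permissions](https://developer.android.com/guide/topics/security/permissions.html).
- [ ] On iOS, store sensitive information (auth tokens, API keys, etc.) in the system keychain. Do __not__ store this kind of information in the user defaults.
- [ ] [Certificate pinning](https://en.wikipedia.org/wiki/HTTP_Public_Key_Pinning) is highly recommended.


##### Security headers and configuration
- [ ] `Add` a [CSP](https://en.wikipedia.org/wiki/Content_Security_Policy) header to mitigate XSS and data injection attacks. This is important.
- [ ] `Add` a [CSRF](https://en.wikipedia.org/wiki/Cross-site_request_forgery) header to prevent cross site request forgery (CSRF) attacks. Also `add` the [SameSite](https://tools.ietf.org/html/draft-ietf-httpbis-cookie-same-site-00) attribute to cookies.
- [ ] `Add` an [HSTS](https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security) header to prevent SSL stripping attacks.
- [ ] `Add` your domain to the [HSTS preload list](https://hstspreload.appspot.com/)
- [ ] `Add` [X-Frame-Options](https://en.wikipedia.org/wiki/Clickjacking#X-Frame-Options) to prevent clickjacking.
- [ ] `Add` [X-XSS-Protection](https://www.owasp.org/index.php/OWASP_Secure_Headers_Project#X-XSS-Protection) to mitigate XSS attacks.
- [ ] `Update` DNS records to add an [SPF](https://en.wikipedia.org/wiki/Sender_Policy_Framework) record to prevent spam and phishing attacks.
- [ ] If your JavaScript is hosted on a third party CDN, `add` [subresource integrity checks](https://en.wikipedia.org/wiki/Subresource_Integrity). For extra security, add the [require-sri-for](https://w3c.github.io/webappsec-subresource-integrity/#parse-require-sri-for) CSP directive so that resources without SRI are not loaded.
- [ ] Use random CSRF tokens, and expose business logic APIs as POST requests. Do not expose CSRF tokens over HTTP, for example during an initial request upgrade.
- [ ] Do not put critical data or tokens in GET request parameters. Exposed server logs would expose user data too.


##### Sanitizing input
- [ ] All input parameters exposed to users should be `sanitized` to prevent [XSS](https://en.wikipedia.org/wiki/Cross-site_scripting) attacks.
- [ ] Use parameterized queries to prevent [SQL injection](https://en.wikipedia.org/wiki/SQL_injection).
- [ ] Sanitize all user input that drives functionality, such as `CSV import`.
- [ ] `Sanitize` special user input, such as robots.txt used as a username when you serve user profile pages at URLs like coolcorp.io/username. (That becomes coolcorp.io/robots.txt, which may not work properly.)
- [ ] Do not assemble JSON strings by hand, no matter how small the object is. Use the libraries or framework for your language.
- [ ] `Sanitize` inputs that look like URLs to prevent [SSRF](https://docs.google.com/document/d/1v1TkWZtrhzRLy0bYXBcdLUedXGb9njTNIJXa3u9akHM/edit#heading=h.t4tsk5ixehdd) attacks.
- [ ] `Sanitize` output before displaying it to users.

##### Operations
- [ ] If your business is small or you lack experience, evaluate running your code on AWS or a PaaS.
- [ ] Use a proper provisioning script to create VMs in the cloud.
- [ ] Check all machines for unnecessarily open `ports`.
- [ ] Check for databases with no password or a default password, especially MongoDB and Redis.
- [ ] Use SSH to log in to your machines; do not use passwords, use SSH key authentication instead.
- [ ] Install updates promptly to guard against 0day vulnerabilities such as Heartbleed and Shellshock.
- [ ] Modify the server configuration so that HTTPS uses TLS 1.2, and disable the other schemes. (It is worth doing.)
- [ ] Do not enable DEBUG mode in production. In some frameworks, DEBUG mode opens up many permissions and backdoors, or exposes sensitive data in error stack traces.
- [ ] Be prepared for bad actors and DDOS attacks; use a hosting service that provides DDOS scrubbing.
- [ ] Monitor your systems and log what happens (for example with [New Relic](https://newrelic.com/) or something similar).
- [ ] If you serve business (B2B) customers, adhere to their compliance requirements. If you use AWS S3, consider the [data encryption](http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html) feature. If you use AWS EC2, consider disk encryption (boot volumes can be encrypted now too).

##### People
- [ ] Set up an email group (e.g. security@coolcorp.io) and a submission page so that security researchers can report vulnerabilities.
- [ ] Depending on your business, limit access to your user databases.
- [ ] Be polite to people who report bugs and vulnerabilities.
- [ ] Have your code reviewed by peers with a secure coding mindset. (More eyes)
- [ ] If you are hacked or data leaks, check earlier logs for data access and notify users to change their passwords. You may need an external agency to help with an audit.
- [ ] Use [Netflix Scumblr](https://github.com/Netflix/Scumblr) to keep track of what is being said about your organization (company) on social networks and search engines, such as hacks and vulnerabilities.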


================================================
FILE: security-checklist.md
================================================
[Back to Contents](README.md)


### The Security Checklist 

##### AUTHENTICATION SYSTEMS (Signup/Signin/2 Factor/Password reset) 
- [ ] Use HTTPS everywhere.
- [ ] Store password hashes using `Bcrypt` (no salt necessary - `Bcrypt` does it for you).
- [ ] Destroy the session identifier after `logout`.  
- [ ] Destroy all active sessions on reset password (or offer to).  
- [ ] Must have the `state` parameter in OAuth2.
- [ ] No open redirects after successful login or in any other intermediate redirects.
- [ ] When parsing Signup/Login input, sanitize for javascript://, data://, CRLF characters. 
- [ ] Set secure, httpOnly cookies.
- [ ] In `OTP` based mobile verification, do not send the OTP back in the response when the `generate OTP` or `Resend OTP` API is called.
- [ ] Limit attempts to the `Login`, `Verify OTP`, `Resend OTP` and `generate OTP` APIs for a particular user. Set an exponential backoff and/or something like a captcha based challenge.
- [ ] Check for randomness of reset password token in the emailed link or SMS.
- [ ] Set an expiration on the reset password token for a reasonable period.
- [ ] Expire the reset token after it has been successfully used.
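The three reset-token items at the end of this list (randomness, expiry, single use) can be sketched together in Python. This is a minimal sketch, not a prescribed design: the in-memory `ResetTokenStore` class stands in for a real database, and the 15 minute lifetime is an assumed value for "a reasonable period".

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple

RESET_TOKEN_TTL_SECONDS = 15 * 60  # assumed "reasonable period"; adjust as needed


class ResetTokenStore:
    """In-memory store of password reset tokens (a stand-in for a database)."""

    def __init__(self) -> None:
        # Maps sha256(token) -> (user_id, expiry timestamp). Only the hash is
        # kept, so a leaked store does not reveal usable tokens.
        self._tokens: Dict[str, Tuple[str, float]] = {}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id: str, now: Optional[float] = None) -> str:
        """Create an unguessable token for user_id and return it for emailing."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
        self._tokens[self._digest(token)] = (user_id, now + RESET_TOKEN_TTL_SECONDS)
        return token

    def redeem(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the user id for a valid token, or None. Tokens work once."""
        now = time.time() if now is None else now
        # pop() removes the entry, so a token cannot be redeemed twice.
        entry = self._tokens.pop(self._digest(token), None)
        if entry is None:
            return None
        user_id, expires_at = entry
        return user_id if now <= expires_at else None
```

The `now` parameter exists only to make expiry testable without waiting; production callers would leave it out.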


##### USER DATA & AUTHORIZATION
- [ ] Any resource access like `my cart` or `my history` should check the logged-in user's ownership of the resource using the session id.
- [ ] Avoid serially iterable resource ids. Use `/me/orders` instead of `/user/37153/orders`. This acts as a sanity check in case you forget to check the authorization token.
- [ ] The `Edit email/phone number` feature should be accompanied by a verification email to the owner of the account.
- [ ] Any upload feature should sanitize the filename provided by the user. Also, for general reasons apart from security, upload to something like S3 (and post-process using lambda) rather than to your own server, which is capable of executing code.
- [ ] The `Profile photo upload` feature should also strip all `EXIF` tags if they are not required.
- [ ] For user ids and other ids, use [RFC compliant](http://www.ietf.org/rfc/rfc4122.txt) `UUID`s instead of integers. You can find an implementation for your language on GitHub.
- [ ] JWTs are awesome. Use them if required for your single page app/APIs.
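The ownership check, the `/me/orders` pattern and the UUID advice from this list can be sketched in Python as below. The `Order` type, the `ORDERS` dictionary and the function names are hypothetical stand-ins for a real data layer, and `session_user_id` is assumed to come from an already authenticated session.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Order:
    order_id: str
    owner_id: str


# Hypothetical in-memory table keyed by order id.
ORDERS: Dict[str, Order] = {}


def create_order(owner_id: str) -> Order:
    """Create an order whose id is a random UUID, not a guessable integer."""
    order = Order(order_id=str(uuid.uuid4()), owner_id=owner_id)
    ORDERS[order.order_id] = order
    return order


def my_orders(session_user_id: str) -> List[Order]:
    """Backs a route like /me/orders: the user id comes from the session only."""
    return [order for order in ORDERS.values() if order.owner_id == session_user_id]


def get_order(session_user_id: str, order_id: str) -> Order:
    """Fetch one order, refusing access unless the session user owns it."""
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours", so ids cannot be probed.
    if order is None or order.owner_id != session_user_id:
        raise PermissionError("order not found for this user")
    return order
```

Since `my_orders` never takes a user id from the URL, there is nothing for a caller to increment or swap.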


##### ANDROID / IOS APP
- [ ] `salt` from payment gateways should not be hardcoded.
- [ ] `secret` / `auth token` from 3rd party SDK's should not be hardcoded.
- [ ] API calls intended to be done `server to server` should not be done from the app.
- [ ] In Android, all the granted [permissions](https://developer.android.com/guide/topics/security/permissions.html) should be carefully evaluated.
- [ ] On iOS, store sensitive information (authentication tokens, API keys, etc.) in the system keychain. Do __not__ store this kind of information in the user defaults.
- [ ] [Certificate pinning](https://en.wikipedia.org/wiki/HTTP_Public_Key_Pinning) is highly recommended.


##### SECURITY HEADERS & CONFIGURATIONS
- [ ] `Add` [CSP](https://en.wikipedia.org/wiki/Content_Security_Policy) header to mitigate XSS and data injection attacks. This is important.
- [ ] `Add` [CSRF](https://en.wikipedia.org/wiki/Cross-site_request_forgery) header to prevent cross site request forgery. Also add [SameSite](https://tools.ietf.org/html/draft-ietf-httpbis-cookie-same-site-00) attributes on cookies.
- [ ] `Add` [HSTS](https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security) header to prevent SSL stripping attack.
- [ ] `Add` your domain to the [HSTS Preload List](https://hstspreload.org/)
- [ ] `Add` [X-Frame-Options](https://en.wikipedia.org/wiki/Clickjacking#X-Frame-Options) to protect against Clickjacking.
- [ ] `Add` [X-XSS-Protection](https://www.owasp.org/index.php/OWASP_Secure_Headers_Project#X-XSS-Protection) header to mitigate XSS attacks.
- [ ] Update DNS records to add [SPF](https://en.wikipedia.org/wiki/Sender_Policy_Framework) record to mitigate spam and phishing attacks.
- [ ] Add [subresource integrity checks](https://en.wikipedia.org/wiki/Subresource_Integrity) if loading your JavaScript libraries from a third party CDN. For extra security, add the [require-sri-for](https://w3c.github.io/webappsec-subresource-integrity/#parse-require-sri-for) CSP-directive so you don't load resources that don't have SRI set.  
- [ ] Use random CSRF tokens and expose business logic APIs as HTTP POST requests. Do not expose CSRF tokens over HTTP, for example in an initial request upgrade phase.
- [ ] Do not use critical data or tokens in GET request parameters. Exposure of server logs or a machine/stack processing them would expose user data in turn.  
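Several of the header items above, plus the secure/httpOnly/SameSite cookie advice, can be collected in one place. The Python sketch below uses only the standard library; the specific header values (for example the `default-src 'self'` policy and `SameSite=Lax`) are illustrative assumptions and need tuning per application, and the names `SECURITY_HEADERS`, `with_security_headers` and `session_cookie_header` are hypothetical.

```python
from http.cookies import SimpleCookie
from typing import List, Tuple

# Illustrative values only; a real CSP usually needs per-site tuning.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
}


def with_security_headers(headers: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the response headers with the security headers added or replaced."""
    managed = {name.lower() for name in SECURITY_HEADERS}
    kept = [(name, value) for name, value in headers if name.lower() not in managed]
    return kept + list(SECURITY_HEADERS.items())


def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie value flagged Secure, HttpOnly and SameSite."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True      # only sent over HTTPS
    cookie["session"]["httponly"] = True    # not readable from JavaScript
    cookie["session"]["samesite"] = "Lax"   # limits cross-site sending
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


print(session_cookie_header("abc123"))
```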
  
  
##### SANITIZATION OF INPUT
- [ ] `Sanitize` all user inputs or any input parameters exposed to user to prevent [XSS](https://en.wikipedia.org/wiki/Cross-site_scripting).
- [ ] Always use parameterized queries to prevent [SQL Injection](https://en.wikipedia.org/wiki/SQL_injection).  
- [ ] Sanitize user input if using it directly for functionalities like CSV import.
- [ ] `Sanitize` user input for special cases like robots.txt as profile names in case you are using a url pattern like coolcorp.io/username. 
- [ ] Do not hand code or build JSON by string concatenation ever, no matter how small the object is. Use your language defined libraries or framework.
- [ ] Sanitize inputs that take some sort of URLs to prevent [SSRF](https://docs.google.com/document/d/1v1TkWZtrhzRLy0bYXBcdLUedXGb9njTNIJXa3u9akHM/edit#heading=h.t4tsk5ixehdd).
- [ ] Sanitize Outputs before displaying to users.
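Three of the items above (parameterized queries, library-built JSON and sanitized output) are easy to show side by side. The Python sketch below uses the standard library's `sqlite3`, `json` and `html` modules; the `users` table and the function names are hypothetical.

```python
import html
import json
import sqlite3
from typing import Optional, Tuple


def find_user(conn: sqlite3.Connection, username: str) -> Optional[Tuple[int, str]]:
    """Look up a user with a parameterized query; input is never spliced into SQL."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchone()


def user_as_json(username: str) -> str:
    """Serialize with the json library instead of string concatenation."""
    return json.dumps({"username": username})


def render_greeting(username: str) -> str:
    """Escape user-controlled text before placing it in HTML output."""
    return "<p>Hello, {}</p>".format(html.escape(username))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

# A classic injection string is treated as a literal (nonexistent) username.
print(find_user(conn, "alice' OR '1'='1"))
print(render_greeting("<script>alert(1)</script>"))
```

The injection attempt finds no row because the whole string is bound as a value, and the script tag is rendered as inert text.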

##### OPERATIONS
- [ ] If you are small and inexperienced, evaluate using AWS elasticbeanstalk or a PaaS to run your code.
- [ ] Use a decent provisioning script to create VMs in the cloud.
- [ ] Check for machines with unwanted publicly `open ports`.
- [ ] Check for no/default passwords for `databases` especially MongoDB & Redis.
- [ ] Use SSH to access your machines; do not setup a password, use SSH key-based authentication instead.
- [ ] Install updates timely to act upon zero day vulnerabilities like Heartbleed, Shellshock.
- [ ] Modify server config to use TLS 1.2 for HTTPS and disable all other schemes. (The tradeoff is good.)
- [ ] Do not leave DEBUG mode on. In some frameworks, DEBUG mode can give access to a full-fledged REPL or shell, or expose critical data in error message stack traces.
- [ ] Be prepared for bad actors & DDOS - use a hosting service that has DDOS mitigation.
- [ ] Set up monitoring for your systems, and log stuff (use [New Relic](https://newrelic.com/) or something like that).
- [ ] If developing for enterprise customers, adhere to compliance requirements. If using AWS S3, consider using the feature to [encrypt data](http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html). If using AWS EC2, consider using encrypted volumes (even boot volumes can be encrypted now).

##### PEOPLE
- [ ] Set up an email (e.g. security@coolcorp.io) and a page for security researchers to report vulnerabilities.
- [ ] Depending on what you are making, limit access to your user databases.
- [ ] Be polite to bug reporters.
- [ ] Have your code review done by a fellow developer from a secure coding perspective. (More eyes)
- [ ] In case of a hack or data breach, check previous logs for data access, ask people to change passwords. You might require an audit by external agencies depending on where you are incorporated.  
- [ ] Set up [Netflix's Scumblr](https://github.com/Netflix/Scumblr) to hear what is being said about your organization on social platforms and Google search.


================================================
FILE: vulnerabilities-stats-zh.md
================================================
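[Back to Contents](README-zh.md)


### Hackerone publicly disclosed bugs stats

So far, 1731 publicly disclosed vulnerabilities have been found on the Hackerone platform, mainly from companies such as Twitter, Uber, Dropbox and Github. Of these, 8 have been deleted and 9 concern the internet at large or a specific language. Of the remaining 1714, we were able to classify 1359, either with code or manually.


#### Issues by type of mistake


| Classification | Count | Percentage |
| --- | --- |  --- |
| User Input Sanitization        | 481      | 27.8
| Other code issues              | 549      | 31.7
| Configuration issues           | 325      | 18.8
| Unclassified+Info+Junk         | 376      | 21.7


#### Issues sorted by their frequency of occurrence

One third of the issues are related to XSS, insecure data references (data leaks) or forgetting to set a CSRF token. The [page](https://hackerone.com/hacktivity/new) listing these issues is very interesting and worth a read.

Type|Count|Percentage
| --- | --- | --- |
XSS|375|21.87
Insecure reference + Data Leak|104|6.06
CSRF Token|99|5.77
Open Redirects|59|3.44
Information/Source Code Disclosure|57|3.32
DNS misconfiguration + Apache/Nginx + Subdomain Takeover + Open AWS_S3|44|2.56
Improper Session management/Fixation|39|2.27
TLS/SSL/POODLE/Heartbleed|39|2.27
HTML/JS/XXE/Content Injections|37|2.15
HTTP Header Issues|34|1.98
NULL POINTER + SEGFAULT + Using memory after free()|33|1.92
DMARC/DKIM/SPF settings for Mail|31|1.8
SQL Injection|28|1.63
Clickjacking|27|1.57
Improper Cookies (secure/httpOnly/exposed)|25|1.45
Path Disclosure|25|1.45
Open Permissions|24|1.4
Brute Force|24|1.4
Content Spoofing|20|1.16
Buffer overflow|20|1.16
Denial Of Service|19|1.1
Server Side Request Forgery|18|1.05
Adobe Flash vulnerabilities|18|1.05
User/Info Enumeration|17|0.99
Remote Code Execution|15|0.87
Password reset token expiry/attempts/others|13|0.75
Integer overflow|11|0.64
Version Disclosure|11|0.64
CSV Injection|10|0.58
Privilege Escalation|9|0.52
OAuth state/leaks and other issues|9|0.52
Password Policy|7|0.4
CRLF|7|0.4
python language|6|0.35
One-way attack|6|0.35
File upload type/size/location sanitization|6|0.35
Captcha|5|0.29
Remote/Local File Inclusion|4|0.23
Directory Listing|4|0.23
Path traversal|4|0.23
Remote File Upload|4|0.23
Autocomplete enabled (web forms)|4|0.23
Leakage through referrer|3|0.17
Pixel Flood Attack|3|0.17
Control Characters in input|2|0.11


### Some unique vulnerability types

1. Race condition vulnerabilities
2. Pixel Flood Attack
3. IDN Homograph Attack
4. Control characters in input leading to interesting outcomes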


================================================
FILE: vulnerabilities-stats.md
================================================
[Back to Contents](README.md)


### Hackerone publicly disclosed bugs Stats

An updated analysis of HackerOne vulnerability reports covers 12,618 issues in total.
All 12,618 issues were assigned to a category using automated parsing and categorization.

#### Issues by type of mistake


| Classification | Count | Percentage |
| --- | --- |  --- |
| User Input Sanitization        | 4267     | 33.8
| Unclassified+Info+Junk         | 4066     | 32.2
| Other code issues              | 3350     | 26.5
| Configuration issues           | 935      | 7.4
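
The percentages in this table follow directly from the counts. A minimal Python sketch of the arithmetic, using the counts from the table above (the variable names are only illustrative):

```python
# Counts copied from the classification table above.
counts = {
    "User Input Sanitization": 4267,
    "Unclassified+Info+Junk": 4066,
    "Other code issues": 3350,
    "Configuration issues": 935,
}

total = sum(counts.values())  # adds up to the 12,618 issues analyzed

# Share of each class, rounded to one decimal place as in the table.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```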


#### Issues sorted by their frequency of occurrence

About half of the issues (49.8%) were related to XSS, information disclosure, or other code issues. The [Hackerone page](https://hackerone.com/hacktivity/new) listing these issues is quite interesting and worth reading.

Type|Count|Percentage
| --- | --- | --- |
Other code issues|2599|20.60
XSS|2168|17.18
Information/Source Code Disclosure|1521|12.05
Unclassified+Info+Junk|1467|11.63
Broken/Open Authentication|868|6.88
SQL Injection|597|4.73
CSRF Token|468|3.71
Denial Of Service|458|3.63
Privilege Escalation|389|3.08
NULL POINTER + SEGFAULT + Using memory after free()|307|2.43
HTML/JS/XXE/Content Injections|299|2.37
Open Redirects|292|2.31
Insecure reference + Data Leak|263|2.08
Server Side Request Forgery|236|1.87
Path traversal|207|1.64
Buffer overflow|163|1.29
Clickjacking|129|1.02
Password Policy|67|0.53
Remote Code Execution|58|0.46
Improper Session management/Fixation|48|0.38
Integer overflow|13|0.10
Brute Force attacks|1|0.01


### Some unique vulnerability types

1. Race conditions based vulnerabilities
2. Pixel Flood Attack
3. IDN Homograph Attack
4. Control Characters in Input leading to interesting outcomes


================================================
FILE: what-can-go-wrong.md
================================================
Recently disclosed details of an old data breach affecting more than 500 million Yahoo users have put a question mark on the multi-billion dollar sale of Yahoo to Verizon. After all, no one wants to acquire a liability without accounting for all the future costs involved. Ashley Madison, an online dating service for cheating spouses, suffered a breach that exposed its users' information and billing details, leading to many real-life divorces, resignations and [suicides, including that of a pastor](http://money.cnn.com/2015/09/08/technology/ashley-madison-suicide/index.html). 600,000 Facebook accounts were hacked daily in 2011, maybe more today. The massive Sony Pictures hack exposed a lot of sensitive information about its projects, personal details of key executives and Hollywood gossip. It also exposed social security numbers and passwords of its employees and around 1 million users, allegedly all stored in a simple Excel file clearly marked as 'Passwords.xls'. The famous Fappening saga, which exposed nude pictures of many Hollywood celebrities, happened when a hacker wrote a [script to test celebrity Apple accounts for the top 500 most popular passwords that were approved by Apple's password policy](https://medium.com/@ryandemattia/what-we-should-learn-from-the-fappening-a-lesson-in-security-design-96e49f7eaee9#.nfm9z8637) (caps, special characters and all). More recently, an attack on Dyn, a key DNS service provider, brought down almost half of the United States' internet and major websites like The New York Times and Twitter.

Where do things go wrong? Most exploits start with a human error that creates an opportunity, which a malicious hacker then attacks. Let us go back to the typical flow of an app and the interaction pattern between the app and its users. Say you are sitting in a cafe browsing Product Hunt and come across a new dating app that promises to provide meaningful relationships. You install the app and sign up using one of the two passwords you use everywhere on the web. If you are a developer, you might wonder why mobile apps have no padlock-like indicator to assure you that the communication between you and the dating app's server is secure and cannot be snooped upon. You start using the app, carefully swiping and writing witty messages you learnt on a Reddit megathread, but all the girls seem to be offline. Suddenly you get a premium subscription offer that will show you to girls more frequently than other straight males. You quickly add your credit card for the $5 subscription and then close the app. The dating app, as with all dating apps with above-average UX, gets a lot of users quickly, a lot of press for its focus on meaningful relationships, and then a ton of money from prominent people in the Valley for changing the world. We shall not focus on these issues; we just want to know where things can go wrong.

Remember, you were sitting in a coffee shop. What if, in a hurry to release their app, the developers are taking signup information over a non-secure channel? This means anyone sitting in the cafe can listen in and intercept your password, which you share across half the services on the internet. This rarely happens today. Let us move to another scenario. What if the Wi-Fi network you connected to, named Cafe_Noir, is not the cafe's Wi-Fi but a fake one set up to intercept your communications? Remember that your mobile phone or laptop saves Wi-Fi networks and their passwords so that when you revisit a cafe, it automatically connects without you having to enter the password again. [With a small cheap setup, anyone can get started stealing data at public WiFi hotspots.](https://go.authentic8.com/blog/stealing-data-over-wifi-is-easier-than-you-think) Are you surprised that your phone and laptop blabber to everyone around about every Wi-Fi network you have connected to? You will learn more about Transport Layer Security and its application to the Web and other platforms and protocols in Chapter 3.

Let us assume nothing exciting happened in transit and you were talking to the app's servers over a secure channel. Sensitive information like passwords is not stored as is in databases but is hashed and then stored. A hash is a garbled representation of your password with almost no way to retrieve your original password from the hash itself. When you sign in later, the value you type into the app's password box is hashed and checked against the hash of your password stored in the database. If both match, you are logged in. This process, known as hashing, ensures that even if the database is leaked, it would be tough to recover your password from the hash.

[[INSERT hash example pic]]
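
The sign-up and sign-in flow described above can be sketched in a few lines of Python. This is an illustration only: `fake_db`, `sign_up` and `sign_in` are made-up names, and plain SHA-256 stands in for the hash function purely to show the flow. A real system needs a slow, salted password hash, for the reasons discussed in the paragraphs that follow.

```python
import hashlib
import hmac

fake_db = {}  # hypothetical user table: username -> stored hash (hex string)


def hash_password(password: str) -> str:
    # SHA-256 is used here only to illustrate the flow; it is fast and
    # unsalted, so it is NOT suitable for real password storage.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def sign_up(username: str, password: str) -> None:
    # Only the hash is stored, never the password itself.
    fake_db[username] = hash_password(password)


def sign_in(username: str, password: str) -> bool:
    stored = fake_db.get(username)
    if stored is None:
        return False
    # Hash the submitted value and compare it with the stored hash.
    return hmac.compare_digest(stored, hash_password(password))


sign_up("alice", "cute@123")
print(fake_db["alice"])              # a 64-character hex digest, not the password
print(sign_in("alice", "cute@123"))  # True
print(sign_in("alice", "123456"))    # False
```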

Other information like your email, addresses, location history and messages exchanged is stored in plain text. The piece of code used to compute the hash for passwords is crucial. Any mistake made in designing or implementing it would be disastrous. Therefore, companies generally go with a proven hashing mechanism and a corresponding mature implementation that has been around for some time and has gone through code reviews. An important consideration for password hashing mechanisms is that the function must be slow to run on computers. This is counter-intuitive, as nobody wants their code to be slow. But it ensures a hacker cannot simply run the function on, say, all 3 to 8 character strings made from every possible combination of letters, numbers and special characters, and then look up the hashes from the database they hacked in the table they just created. Such precomputed tables are called rainbow tables. Even with slower hash functions, one may throw additional computing power at the problem and create this rainbow table. To prevent rainbow tables from working effectively on hashed passwords, we generate a small random string called a salt, append it to the user's password, and then generate and save the hash of the combination. The salt is stored in plain text alongside the hash. Now the attacker has to generate a different rainbow table for each and every password, taking into account the salt for the respective hash, making the computation mostly infeasible. Another characteristic of these hashing mechanisms is that they should avoid generating exactly the same hash for two different passwords or strings (called a hash collision). It should be difficult for someone to come up with two different strings that generate the same hash. Note the use of the words avoid and difficult: collisions will exist, but nobody should be able to produce one on purpose. 
This condition proved to be an issue when [researchers were able to generate two different certificates with the same hash using the MD5 hashing mechanism, thus effectively breaking SSL](http://www.zdnet.com/article/ssl-broken-hackers-create-rogue-ca-certificate-using-md5-collisions/). MD5 was phased out as a hashing mechanism for SSL certificates by all major browsers by 2013, and even the stronger SHA-1 is in the process of being phased out by 2017 due to fear of collisions. You will learn the details of hashing and various hashing methods in Chapter 7. If there is a single thing you need to remember from this document, it is never to store passwords in plaintext. Just use bcrypt.
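
Here is a minimal sketch of salted, deliberately slow password hashing. bcrypt is a third-party package in Python, so this sketch substitutes PBKDF2-HMAC-SHA256 from the standard library to stay self-contained. The iteration count and the function names are assumptions made for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor for this sketch; it makes each guess slow


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, hash). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    # Recompute the hash with the stored salt and compare in constant time.
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users with the same password get different salts and therefore different
# hashes, so one precomputed table cannot cover both.
salt_a, hash_a = hash_password("123456")
salt_b, hash_b = hash_password("123456")
print(hash_a != hash_b)                           # True
print(verify_password("123456", salt_a, hash_a))  # True
print(verify_password("dadada", salt_a, hash_a))  # False
```

Both the salt and the hash would be stored in the user's database row; the salt is not a secret.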

But how do the hackers get the data dump in the first place? Most of the time it is fairly easy. [In the case of the Sony Pictures hack, the hacker group LulzSec is said to have exploited a SQL injection vulnerability](https://go.authentic8.com/blog/stealing-data-over-wifi-is-easier-than-you-think). SQL is a language used to fetch and modify data in databases. A major weakness in the way SQL is commonly used is that code and data get mixed in the same string. When a program builds SQL statements by directly including input from users, hackers can supply code instead of data and extract extra information from the database. You will learn more about SQL injection, how to prevent it and how to check for it in Chapter 6.

[[Sample of SQL and SQL injection]]
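
As a rough illustration, the following Python sketch uses an in-memory SQLite database to contrast a query built by string concatenation with a parameterized one. The table, the function names and the payload are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice@example.com"), ("bob", "bob@example.com")],
)


def find_user_unsafe(name: str):
    # VULNERABLE: user input is pasted into the SQL string, so the database
    # cannot tell the attacker's code from data.
    query = "SELECT name, email FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str):
    # Parameterized query: the input is always treated as data.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "nobody' OR '1'='1"
print(find_user_unsafe(payload))  # returns every row in the table
print(find_user_safe(payload))    # returns no rows
```

With the payload above, the unsafe query becomes `... WHERE name = 'nobody' OR '1'='1'`, which is true for every row.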

Another common way that hackers get these dumps is by simply connecting to database machines belonging to companies that either have no passwords and are exposed publicly on the internet, or have common or weak passwords. Shodan, a search engine for open ports across the whole internet, provides an easy way to search for open databases like MongoDB and MySQL, and even shows the database sizes. Taking a dump of these databases is a matter of minutes to hours, depending on the quantity of data. We will learn about various configuration mistakes in Chapter 12.

![Shodan search for open MongoDb instances](images/shodan.png)

Two important questions arise:

1. Why use passwords? The general crowd is extremely bad at selecting strong passwords and tends to reuse the same password across different internet services. Why do we even need a password to prove that we really are who we claim to be? Is there a safer way to do this? We evaluate this in Chapter 8, and a password-less way of authentication and much more in Chapter 9.
2. Why store all our information on a remote server that has no way to prove to us that our information is being stored securely or, heck, that our password is being hashed at all and is not in plaintext like at Sony? Is there a way we can store our information ourselves and deploy our own measures to secure it, depending on how likely we are to be attacked, while remaining free to take our data offline or delete it whenever we want?

Once the hackers have your password, or manage to crack its hash, it is trivial to log in to your account and all other accounts that share the same password. To prevent this, most companies have resorted to something called 2 factor authentication. You might have used this in Gmail, Twitter or Facebook login where, after you correctly enter your password, you get a security PIN code as a text message on your designated mobile phone. In the case of Gmail, you can even use the Google Authenticator app, which gives you a code to enter during login. Having the mobile phone or the configured Google Authenticator app adds an extra layer of assurance that you are who you claim to be. You will learn about authentication and 2 factor authentication methods in detail in Chapter 4.
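
Authenticator apps typically generate these codes with the time-based one-time password scheme (TOTP, RFC 6238), which builds on HOTP (RFC 4226). A minimal Python sketch follows, assuming the usual defaults of HMAC-SHA1, 30-second steps and 6 digits. The secret below is the test value published in the RFCs and must not be used for anything real.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low 4 bits pick the offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the current time step."""
    if at is None:
        at = time.time()
    return hotp(secret, int(at // step), digits)


shared_secret = b"12345678901234567890"  # RFC test secret, shared by server and app
print(totp(shared_secret))         # code valid for the current 30-second window
print(totp(shared_secret, at=59))  # 287082, matching the RFC test vectors
```

The server and the app share the secret once during setup. After that, both can compute the same code independently, so nothing has to travel over SMS.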

You should be aware that 2 factor authentication using SMS as the second factor is no longer considered secure. It is possible to fool your mobile phone into connecting to a hacker's homemade device which acts like a local mobile tower ([more commonly known by the name Stingray](http://resources.infosecinstitute.com/stingray-technology-government-tracks-cellular-devices/)) and then intercept all communications. It is also possible (though difficult) to exploit a vulnerability in the telephony signalling protocol SS7 to route calls and SMS meant for your phone to a malicious hacker's phone. In some countries, it is fairly easy to call up a telecom provider, request a SIM change, and gain access to the victim's 2 factor SMS codes, [as happened with an activist whose Twitter account was hacked despite 2 factor authentication being active](https://www.wired.com/2016/06/deray-twitter-hack-2-factor-isnt-enough/). In fact, [the National Institute of Standards and Technology of the US has deprecated SMS as a medium for 2 factor authentication in August 2016.](https://techcrunch.com/2016/07/25/nist-declares-the-age-of-sms-based-2-factor-authentication-over/) We will examine other methods of secure 2 factor authentication in Chapter 4.

Financial data makes up an important part of data leaks. The darknet (a network layered over the internet that can only be accessed with specifically configured software such as Tor) is full of shady entities selling complete credit card information in bulk. Most of these cards came from some data leak or other. A data leak involving card information will impact the affected entity's users severely, sometimes months after the leak was disclosed and fixed.

When you use a third party service like Google Analytics or login via Facebook on your website or app, you get a set of tokens that uniquely identify and authenticate your app to their systems. These tokens, also called API keys, are meant to be used in your code. Sometimes you might inadvertently expose these tokens publicly, for example when pushing your code to a public repository on GitHub. These API keys can then be misused to get data, disrupt your systems or sometimes even wipe out your cloud deployments (e.g. with Amazon Web Services full-access API keys). In fact, GitHub is full of exposed API keys for nearly all popular API providers.
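
One way to catch such mistakes before pushing is to scan your code for strings that look like keys. The Python sketch below assumes the commonly documented shape of an AWS access key ID (`AKIA` followed by 16 upper-case letters or digits). Other providers use other formats, and the function and environment variable names are hypothetical.

```python
import re

# Assumed pattern: "AKIA" followed by 16 upper-case letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_exposed_keys(source: str):
    """Return (line_number, match) pairs for anything that looks like a key ID."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in AWS_KEY_ID.findall(line):
            hits.append((lineno, match))
    return hits


sample = '''import os
aws_key = "AKIAIOSFODNN7EXAMPLE"      # hard-coded: would leak if pushed
aws_key = os.environ["AWS_KEY_ID"]    # read from the environment instead
'''
print(find_exposed_keys(sample))  # [(2, 'AKIAIOSFODNN7EXAMPLE')]
```

Keeping keys in environment variables or a secrets store, as in the last line of the sample, means there is nothing sensitive in the repository to leak.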

An important vulnerability for serious stuff like device takeover or stealing remote information is Remote Code Execution (RCE), the ability to run an arbitrary piece of code on the target's device, be it a laptop, mobile phone or a [nuclear power plant (e.g. Stuxnet is a sophisticated malware designed to sabotage nuclear facilities)](https://www.wired.com/2014/11/countdown-to-zero-day-stuxnet/). If you go through Apple's iOS and macOS security update logs, you will find RCE fixes in almost every other update, [like for example taking over an Apple device via a simple text message or email](https://www.theguardian.com/technology/2016/jul/22/stagefright-flaw-ios-iphone-imessage-apple). And [there might be an equivalent bug for Android so that Apple haters are not left out on getting hacked.](https://www.theguardian.com/technology/2015/jul/28/stagefright-android-vulnerability-heartbleed-mobile) There have been a significantly large number of vulnerabilities in Adobe Flash which could be exploited to run a hacker's code, one of the reasons for its deprecation in major web browsers.

![Top 50 products by disclosed vulnerabilities](images/cves.png)

**Human Factor in Security**

An important but mostly overlooked aspect of security design and flaws is social engineering attacks. Human interaction is an important aspect of software and is often misused in various non-technical ways to gain illegal access to systems. For example, the hacker who called Verizon to get a new SIM to hack the activist's account probably did not write a single line of code to break 2 factor authentication. A huge population around the world, mainly in developing countries, is having its first internet experience right now. Without the right information, it would be fairly easy to engineer an attack that appears legitimate to them and then convince them to expose their personal information, passwords and payment details. For example, a link like https://google.com/amp/gmail-login.website will redirect to gmail-login.website, which can be a hacker's website with a valid domain and a Gmail-like login page, thanks to new TLDs or domain extensions like .website coming up. This is happening at an alarming rate right now. If the dating app in our earlier example has not set up its DNS records for email properly (specifically SPF, DKIM and DMARC), it would be fairly easy to send a mail that appears to come from its domain and email address and then trick its users into revealing personal data. DNS settings are explained in detail in Chapter 12. 
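
As a developer, you can at least make sure your own site cannot be used as the trusted first hop in such a link. A minimal Python sketch of a redirect check against an allowlist follows; the host names and the function name are placeholders for this example.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts this app is willing to redirect to.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_safe_redirect(target: str) -> bool:
    """Accept only http(s) URLs whose host is explicitly allowlisted."""
    parsed = urlparse(target)
    return parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS


print(is_safe_redirect("https://example.com/profile"))        # True
print(is_safe_redirect("https://gmail-login.website/"))       # False
print(is_safe_redirect("https://example.com.evil.website/"))  # False
print(is_safe_redirect("//gmail-login.website/"))             # False
```

Comparing the full host name against an exact list avoids the trap of prefix or substring checks, which the third example would otherwise pass.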

A common way to exploit users is to place clickbaity advertisements on porn websites inviting a click, which then either asks for personal information and payment details or leads to a malware install. Malware is a small executable that is mostly controlled remotely and performs a certain set of instructions, like deleting files on the user's system or stealing user passwords. It can also be used to attack other websites and systems in tandem with malware installed on many other systems, collectively called a botnet.


Every system on the internet, be it a website or a mobile app like Pokemon Go, is capable of handling only a certain amount of traffic. When the system is bombarded with traffic much larger than it is designed to handle, it collapses, leaving its users with no service. This is commonly known as a Denial of Service or DoS attack. When the traffic generated to bring down a system comes not from a single system but from various sources, it is said to be a distributed DoS (DDoS). DDoS attacks are generally carried out via botnets, which are networks of compromised systems. Recently, a DDoS attack on a DNS service provider brought down almost half of the US internet, including popular websites like Twitter, Netflix, Reddit, The New York Times, The Guardian and more. [Preliminary reports suggest that this was the work of amateur hackers using a popular botnet called Mirai](https://techcrunch.com/2016/10/26/dyn-dns-ddos-likely-the-work-of-script-kiddies-says-flashpoint/). 

Note that the compromised systems need not be computers or mobile phones. They can be any internet-connected device that has some processing power and the ability to execute code. With devices such as baby monitors and digital cameras connected to the internet, most of them with default or weak passwords, it has become fairly easy to create botnets using these devices. Mirai is such a botnet that feeds on insecure IoT devices and [whose code was recently released on GitHub](https://github.com/jgamblin/Mirai-Source-Code) by an anonymous hacker. [Reports suggest Mirai bots have more than doubled since the release of its source code in public](https://threatpost.com/mirai-bots-more-than-double-since-source-code-release/121368/). [Insecam is a directory of insecure internet-enabled cameras](https://www.insecam.org/) that gives you live feeds from around the world.

![Open internet enabled camera feeds on Insecam](images/insecam.png)

DDoS attacks are generally measured in terms of the bandwidth of the traffic. The recent attack on Dyn, a DNS service provider for major internet services, was measured at 1.2 Tbps, almost twice the size of the previous largest DDoS attack, which also happened in 2016.

![A map of live DDoS attacks](images/ddos.png) 


You can read more about DoS attacks in Chapter 13.


In 2005, MySpace (when it still was a thing) was hit by a relatively harmless vulnerability where more than a million users' profiles displayed the text 'but most of all, samy is my hero'. All this happened within a day of its release by 21 year old Samy Kamkar. MySpace had to be shut down for some time to fix the issue, and Samy was raided by the FBI and put on probation with no computer access for three years. This was an example of a Cross-Site Scripting bug, more commonly known as XSS. XSS is one of the most frequently occurring vulnerabilities in web security. XSS allows execution of arbitrary JavaScript code in a user's browser when the user visits a vulnerable website. The code can load other remote JavaScript files, steal cookies and session information, deface websites, deceive users into disclosing their secrets by posing as the legitimate website, or display some text as in the case of MySpace and Samy. XSS is discussed in Chapter 6, and some ways to mitigate it using modern HTTP security headers are discussed in Chapter 11.
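
The core of the problem is user-supplied text being placed into a page without escaping. A minimal Python sketch using the standard library's `html.escape` shows the difference; the rendering functions and the payload are invented for illustration, and real applications would normally rely on a templating engine that escapes by default.

```python
import html


def render_profile_unsafe(bio: str) -> str:
    # VULNERABLE: user-supplied text is placed into the page as-is.
    return "<div class='bio'>" + bio + "</div>"


def render_profile_safe(bio: str) -> str:
    # Escape <, >, & and quotes so the browser shows the text instead of running it.
    return "<div class='bio'>" + html.escape(bio, quote=True) + "</div>"


payload = "<script>alert('samy is my hero')</script>"
print(render_profile_unsafe(payload))  # the script tag reaches the browser intact
print(render_profile_safe(payload))    # the tag is rendered as harmless text
```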

A group of people are of the opinion that [improper input handling is the root cause of almost all security issues](http://www.langsec.org). A programming language, or even a natural language, usually has a defined grammar which gives meaning and structure to content written in the language. In the case of programming languages, a parser is used to process the content and extract something useful from it. Since parsers are themselves pieces of code implemented in some programming language, they can contain bugs that lead to security vulnerabilities. A given grammar can also be implemented in various systems by different people, leading to an interesting variety of errors. Hand-coding parsers can only make the issue worse. Nginx and Apache are two of the most commonly used open source web servers around the world. In 2013, a bug was discovered in the nginx parser for HTTP headers, written in the C programming language; [exactly the same bug was discovered and fixed in Apache 11 years earlier](http://langsec.org/papers/the-bugs-we-have-to-kill.pdf). Another issue in parsing the name (specifically the Common Name) of the server in HTTPS certificates could lead [to issuing a certificate for a legitimate site to an evil entity](http://langsec.org/papers/Sassaman.pdf). You will learn more about HTTPS in the next chapter, and we will examine all input/output related issues in Chapter 6.



Security does not have to make things difficult for end users. If your fancy two factor authentication is cumbersome, adoption amongst your users will drop, making your security measures useless. Another common mistake is failing to communicate security measures and actions effectively to your users. We discuss UX design issues and tradeoffs for security in Chapter 17.


Before you get started with looking under the hood of systems, you need to be aware of the legal implications. In some countries, something as simple as logging into an open computer belonging to an organization can land you in jail for several years, even if you did it with the right intentions. Most larger companies and startups have a bug bounty program, where you can report any vulnerability found in the company's systems and get rewarded if the bug is within the scope of the program. If you are itching to look under the hood, the best way to start would be to find companies with public bug bounty programs and get started with them.


Unless you are an academic researcher, you will almost never get to design cryptographic protocols and systems or even implement a large enough parser, and that is a good thing. Anything newly designed and implemented is prone to be buggy and vulnerable until it has gone through many revisions. However, you should be aware of the choices available to you and make an informed choice after evaluating all the tradeoffs. This guide will help you make better security related decisions. You will be able to avoid pitfalls such as security by obscurity. You will appreciate the need for stronger hash functions and maybe even have a desire to look under the hood whenever you are using a system online, while knowing the repercussions. As you progress through this guide, our aim is to start from nothing and make you understand the most common attack scenarios and the ways you as a developer can avoid them. You will be able to avoid common pitfalls while you write code or configure your systems. Some readers might find this guide slow; please feel free to skip some portions of the guide and mail any feedback to guide@fallible.co
