first commit
yukinying committed Sep 22, 2015
0 parents commit ea76cc3
Showing 40 changed files with 4,127 additions and 0 deletions.
33 changes: 33 additions & 0 deletions .gitignore
@@ -0,0 +1,33 @@
# Compiled Object files, Static and Dynamic libs (Shared Objects)
*.o
*.a
*.so

# Folders
_obj
_test

# Architecture specific extensions/prefixes
*.[568vq]
[568vq].out

*.cgo1.go
*.cgo2.c
_cgo_defun.c
_cgo_gotypes.go
_cgo_export.*

_testmain.go

*.exe
*.test
*.prof

# NSQ temporary files.
*.dat

# logstashes
*.log
logstash-forwarder.crt
logstash-forwarder.key
.logstash-forwarder
27 changes: 27 additions & 0 deletions LICENSE
@@ -0,0 +1,27 @@
Copyright (c) 2015 Yahoo Inc. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Yahoo Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
54 changes: 54 additions & 0 deletions Makefile
@@ -0,0 +1,54 @@

# This Makefile is adapted from https://github.com/hashicorp/consul/blob/master/Makefile

DEPS = $(shell go list -f '{{range .TestImports}}{{.}} {{end}}' ./...)

PACKAGES = $(shell go list ./...)
VETARGS?=-asmdecl -atomic -bool -buildtags -copylocks -methods \
-nilfunc -rangeloops -shift -structtags -unsafeptr
#-printf

all: deps format

cov:
	gocov test | gocov-html > /tmp/coverage.html
	open /tmp/coverage.html

deps:
	go get -d -v ./... $(DEPS)

updatedeps: deps
	go get -d -f -u ./... $(DEPS)

build: test
	cd cmd/gryffin-standalone; go install

test: deps
	go test ./...
	@$(MAKE) vet

test-mono:
	go run cmd/gryffin-standalone/main.go "http://127.0.0.1:8081"
	go run cmd/gryffin-standalone/main.go "http://127.0.0.1:8082/dvwa/vulnerabilities/sqli/?id=1&Submit=Submit"

test-integration:
	INTEGRATION=1 go test ./...

test-cover: deps
	go test --cover ./...

format: deps
	@go fmt $(PACKAGES)

vet:
	@go tool vet 2>/dev/null ; if [ $$? -eq 3 ]; then \
		go get golang.org/x/tools/cmd/vet; \
	fi
	@go tool vet $(VETARGS) . ; if [ $$? -eq 1 ]; then \
		echo ""; \
		echo "Vet found suspicious constructs. Please check the reported constructs"; \
		echo "and fix them if necessary before submitting the code for review."; \
	fi

.PHONY: all cov deps updatedeps build test test-mono test-integration test-cover format vet
88 changes: 88 additions & 0 deletions README.md
@@ -0,0 +1,88 @@
Gryffin (beta)
==========

Gryffin is a large-scale web security scanning platform. It is not yet another scanner. It was written to solve two specific problems with existing scanners: coverage and scale.

Better coverage translates to fewer false negatives. Inherent scalability translates to the capability to scan and support a large, elastic application infrastructure. Simply put, it is the ability to go from scanning 1,000 applications today to 100,000 applications tomorrow through straightforward horizontal scaling.

## Coverage
Coverage has two dimensions: one during crawling and the other during fuzzing. In the crawl phase, coverage means finding as much of the application footprint as possible. In the scan phase, or while fuzzing, it means testing each part of the application in depth for the applicable set of vulnerabilities.

#### Crawl Coverage
Today a large number of web applications are template-driven, meaning the same code or path generates millions of URLs. A security scanner needs just one of the millions of URLs generated by the same code or path. Gryffin's crawler does just that.

##### Page Deduplication
Gryffin has a deduplication engine at its heart that compares each new page with the pages already seen. If the HTML structure of the new page is similar to that of pages already seen, it is classified as a duplicate and not crawled further.
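
The idea can be sketched in Go as follows. This is an illustrative sketch only, not Gryffin's actual engine: it assumes a page is summarized by its sequence of opening tag names, that the sequence is fingerprinted with a 64-bit simhash over adjacent tag pairs, and that two pages count as duplicates when their fingerprints differ in at most `threshold` bits. The names `tagNames`, `fingerprint`, and `isDuplicate`, the regular expression, and the threshold value are all hypothetical choices made for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/bits"
	"regexp"
	"strings"
)

// tagRe matches opening tags such as <div> or <a href="...">.
// Closing tags (</div>) are skipped because '/' is not a letter.
var tagRe = regexp.MustCompile(`<([a-zA-Z][a-zA-Z0-9]*)`)

// tagNames reduces a page to its sequence of lowercased opening tag names,
// discarding text and attributes so that only the structure remains.
func tagNames(page string) []string {
	var names []string
	for _, m := range tagRe.FindAllStringSubmatch(page, -1) {
		names = append(names, strings.ToLower(m[1]))
	}
	return names
}

// fingerprint computes a 64-bit simhash over adjacent tag pairs.
// Each pair is hashed; every bit position is voted up or down, and the
// sign of each vote total becomes one bit of the fingerprint.
func fingerprint(page string) uint64 {
	names := tagNames(page)
	var votes [64]int
	for i := 0; i+1 < len(names); i++ {
		h := fnv.New64a()
		h.Write([]byte(names[i] + ">" + names[i+1]))
		sum := h.Sum64()
		for b := 0; b < 64; b++ {
			if sum&(1<<uint(b)) != 0 {
				votes[b]++
			} else {
				votes[b]--
			}
		}
	}
	var fp uint64
	for b, v := range votes {
		if v > 0 {
			fp |= 1 << uint(b)
		}
	}
	return fp
}

// isDuplicate reports whether two pages are structurally similar, measured
// as the Hamming distance between their fingerprints.
func isDuplicate(a, b string, threshold int) bool {
	return bits.OnesCount64(fingerprint(a)^fingerprint(b)) <= threshold
}

func main() {
	// Two pages from the same template: different text, same structure.
	story1 := `<html><body><div><h1>Story one</h1><p>Text</p><a href="/1">more</a></div></body></html>`
	story2 := `<html><body><div><h1>Another story</h1><p>Other text</p><a href="/2">more</a></div></body></html>`
	fmt.Println(isDuplicate(story1, story2, 2)) // prints "true"
}
```

Because text and attribute values are ignored, pages generated by the same template produce the same fingerprint, so a crawler using this check would follow only the first of them.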

##### DOM Rendering and Navigation
A large number of applications today are rich applications, heavily driven by client-side JavaScript. To discover links and code paths in such applications, Gryffin's crawler uses PhantomJS for DOM rendering and navigation.

#### Scan Coverage
As Gryffin is a scanning platform, not a scanner, it does not have its own fuzzer modules, even for fuzzing common web vulnerabilities like XSS and SQL injection.

It's not wise to reinvent the wheel where you do not have to. Gryffin at production scale at Yahoo uses open source and custom fuzzers. Some of these custom fuzzers might be open sourced in the future, and might or might not be part of the Gryffin repository.

For demonstration purposes, Gryffin comes integrated with sqlmap and arachni. It does not endorse them or any other scanner in particular.

The philosophy is to improve scan coverage by being able to fuzz for just what you need.

## Scale
While Gryffin is available as a standalone package, it's primarily built for scale.

Gryffin is built on the publisher-subscriber model. Each component is either a publisher or a subscriber or both. This allows Gryffin to scale horizontally by simply adding more subscriber or publisher nodes.
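
A minimal in-process sketch of this model in Go is shown below. It is an assumption-laden illustration, not Gryffin's code: Go channels stand in for a message queue topic, goroutines stand in for subscriber nodes, and the function name `process` and the "scanned" result string are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// process publishes every URL to a topic and lets the given number of
// subscriber workers (at least 1) consume from it. Adding workers raises
// throughput without any change to the publisher.
func process(urls []string, workers int) []string {
	topic := make(chan string)   // stand-in for a queue topic
	results := make(chan string) // stand-in for a results topic

	// Subscribers: each worker takes whichever message arrives next.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range topic {
				results <- fmt.Sprintf("scanned %s", u)
			}
		}()
	}

	// Publisher: emits work and closes the topic when done.
	go func() {
		for _, u := range urls {
			topic <- u
		}
		close(topic)
	}()

	// Close the results channel once every subscriber has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // arrival order is nondeterministic, so sort for display
	return out
}

func main() {
	urls := []string{"http://127.0.0.1:8081/a", "http://127.0.0.1:8081/b"}
	for _, line := range process(urls, 3) {
		fmt.Println(line)
	}
}
```

The publisher never needs to know how many subscribers exist, which is the property that makes it possible to scale by adding nodes.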

## Operating Gryffin

### Prerequisites

1. Go
2. PhantomJS, v2
3. Sqlmap (for fuzzing SQLi)
4. Arachni (for fuzzing XSS and web vulnerabilities)
5. NSQ,
    - running lookupd at ports 4160 and 4161
    - running nsqd at ports 4150 and 4151
    - with `--max-msg-size=5000000`
6. Kibana and Elasticsearch, for dashboarding
    - listening to JSON over port 5000
    - Preconfigured docker image available at https://hub.docker.com/r/yukinying/docker-elk/


### Installation

```
go get github.com/yahoo/gryffin/...
```

### Run

#### Example 1: A site with 1M+ URLs
A typical site with millions of URLs like news.yahoo.com is scanned below to show the importance of
TBD: Link to news/finance scan video

#### Example 2: A rich app
TBD: Link to Flickr scan video


## TODO

1. Mobile browser user agent
2. Preconfigured docker images
3. Redis for sharing states across machines
4. Instruction to run gryffin (distributed or standalone)
5. Documentation for html-distance
6. Implement a JSON serializable cookiejar.
7. Identify duplicate url patterns based on simhash result.

## Credits

- Adonis Fung @ Yahoo, for the asynchronous phantomjs based crawler and DOM event navigator.
- [Simhash algorithm](http://www.cs.princeton.edu/courses/archive/spring04/cos598B/bib/CharikarEstim.pdf) by Moses Charikar
- Simhash implementation provided by [mfonda/simhash](https://github.com/mfonda/simhash).
- [Sqlmap](http://sqlmap.org/)
- [Arachni](http://www.arachni-scanner.com/)


## Licence

Code licensed under the BSD-style license. See LICENSE file for terms.