removing registry page (#802)

* removing registry page

* removing registry page

* removing more references to the registry and 'foundation'

* remove readme references to registry, etc...

* remove registry stuff from development docs

* removing registry from README
Aaron Schlesinger
2018-10-25 15:39:38 -07:00
committed by GitHub
parent 322ff26694
commit dbac943f3e
7 changed files with 48 additions and 301 deletions
+6 -14
@@ -1,6 +1,6 @@
# Development Guide for Athens
Both the registry and the proxy are written using the [Buffalo](https://gobuffalo.io/) framework. We chose
The proxy is built on the [Buffalo](https://gobuffalo.io/) framework. We chose
this framework to make it as straightforward as possible to get your development environment up and running.
You'll need Buffalo [v0.12.4](https://github.com/gobuffalo/buffalo/releases/tag/v0.12.4) or later to get started on Athens,
@@ -26,8 +26,7 @@ or whichever binary you want to use with athens
# Services that Athens Needs
Both the proxy and the registry rely on several services (i.e. databases, etc...) to function
properly. We use [Docker](http://docker.com/) images to configure and run those services.
Athens relies on several services (i.e. databases, etc...) to function properly. We use [Docker](http://docker.com/) images to configure and run those services.
If you're not familiar with Docker, that's ok. In the spirit of Buffalo, we've tried to make
it easy to get up and running:
@@ -42,29 +41,22 @@ If you want to stop everything at any time, run `make down`.
Note that `make dev` only runs the minimum set of dependencies needed for things to work. If you'd like to run all the possible dependencies, run `make alldeps` or directly start the services available in the `docker-compose.yml` file. Keep in mind, though, that `make alldeps` does not start up Athens or Olympus, but **only** their dependencies.
# Run the Proxy or the Registry
# Run the Proxy
As you know from reading the [README](./README.md) (if you didn't read the whole thing, that's ok. Just read the
introduction), the Athens project is made up of two components:
1. [Package Registry](https://docs.gomods.io/design/registry/)
2. [Edge Proxy](https://docs.gomods.io/design/proxy/)
To run the proxy:
After you've set up your dependencies, the `buffalo` CLI makes it easy to launch the proxy:
```console
cd cmd/proxy
buffalo dev
```
After either `buffalo dev` command, you'll see some console output like:
After `buffalo dev` starts up, you'll see some console output like:
```console
Starting application at 127.0.0.1:3000
```
And you'll be up and running. As you edit and save code, the `buffalo dev` command will notice and automatically
re-compile and restart the server.
And you'll be up and running. As you edit and save code, the `buffalo dev` command will notice and automatically re-compile and restart the server. That makes your life a little easier!
# Run unit tests
+12 -23
@@ -9,39 +9,29 @@
[![Go Report Card](https://goreportcard.com/badge/github.com/gomods/athens)](https://goreportcard.com/report/github.com/gomods/athens)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)
Welcome to the Athens project! We're building all things Go package repository in here.
1. [Package Registry](https://docs.gomods.io/design/registry/)
2. [Edge Proxy](https://docs.gomods.io/design/proxy/)
Welcome to the Athens project! We are a proxy server for the [Go Modules download API](https://docs.gomods.io/intro/protocol/).
See our documentation site [https://docs.gomods.io](https://docs.gomods.io) for more details on the project.
# Project Status
Project Athens is in a very early alpha release and everything might change.
Don't run it in production, but do play around with it and [contribute](#contributing)
when you can!
Project Athens is in alpha. Things might change, so we recommend that you don't run it for production workloads. We have organizations that are testing it internally, and there is an experimental public proxy server running.
We encourage you to [test it out](https://docs.gomods.io/install/) and [contribute](#contributing) when you can!
# More Details Please!
Although the project is in development, here's where we're going:
Although the project is alpha, here's where we're going:
The package registry and the edge proxy both implement the [vgo download protocol](https://medium.com/@arschles/project-athens-the-download-protocol-2b346926a818), but each one
is intended for different purposes.
The proxy implements the [Go modules download protocol](https://docs.gomods.io/intro/protocol/).
The registry will be hosted globally, and will be "always on" for folks. Anyone will be able to
configure their machine to do a `go get` (right now, it's a `vgo get`) and have it request
packages from the registry.
There is currently an experimental public proxy, and we have plans to host a more stable public proxy with more guarantees. We also have a community of folks who are testing Athens inside their organizations, as an "internal proxy." In either deployment, users set their `GOPROXY` environment variable to point to the Athens proxy of their choice. At that point, `go get`, `go build`, and other `go` commands that need modules will use the proxy to download dependencies as necessary.
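As a rough illustration of what pointing `GOPROXY` at a proxy means, the sketch below builds the URLs a client requests for one module version under the Go modules download protocol (`list`, `.info`, `.mod`, `.zip`). The proxy address `http://localhost:3000` and the helper names `escapePath` and `protocolURLs` are assumptions made for this example; they are not part of Athens.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the download protocol's case encoding: every uppercase
// ASCII letter becomes '!' followed by its lowercase form. This is a
// simplified, hypothetical helper that performs no path validation.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + ('a' - 'A'))
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// protocolURLs returns the URLs a client would request from the proxy for a
// single module version: the version list, then the info, mod and zip files.
func protocolURLs(proxy, module, version string) []string {
	base := strings.TrimSuffix(proxy, "/") + "/" + escapePath(module) + "/@v/"
	v := escapePath(version)
	return []string{
		base + "list",
		base + v + ".info",
		base + v + ".mod",
		base + v + ".zip",
	}
}

func main() {
	// The proxy address is an assumption; use whatever GOPROXY points to.
	for _, u := range protocolURLs("http://localhost:3000", "github.com/BurntSushi/toml", "v0.3.1") {
		fmt.Println(u)
	}
}
```

Any server that answers these paths can sit behind `GOPROXY`, which is why the same client setup works for a public proxy and an internal one.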
On the other hand, the registry will only host _public_ code. If you have private code, the
edge proxy is your jam. The proxy will store your private code for you, in your database
of choice. It will be designed to also cache packages from the registry, subject to
an exclude list.
Athens proxies are highly configurable, so they can work for lots of different deployments. For example, public proxies can store code in cloud databases and CDNs, while internal "enterprise" deployments can use disk-based (i.e. NFS) storage.
# Development
See [DEVELOPMENT.md](./DEVELOPMENT.md) for details on how to set up your development environment
and start contributing code.
See [DEVELOPMENT.md](./DEVELOPMENT.md) for details on how to set up your development environment and start contributing code.
Speaking of contributing, read on!
@@ -66,11 +56,10 @@ If you're not ready to contribute code yet, there are plenty of other great ways
- Come to our [weekly development meetings](https://docs.google.com/document/d/1xpvgmR1Fq4iy1j975Tb4H_XjeXUQUOAvn0FximUzvIk/edit#)! They are a great way to meet folks, ask questions, find some stuff to work on, or just hang out if you want to. Just like with this project, absolutely everyone is welcome to join and participate in those
- Get familiar with the system. There's lots to read about. Here are some places to start:
- [Gentle Introduction to the Project](https://medium.com/@arschles/project-athens-c80606497ce1) - the basics of why we started this project
- [The Download Protocol](https://medium.com/@arschles/project-athens-the-download-protocol-2b346926a818) - the core API that the registry and proxies implement and CLIs use to download packages
- [Registry Design](https://docs.gomods.io/design/registry/) - what the registry is and how it works
- [The Download Protocol](https://medium.com/@arschles/project-athens-the-download-protocol-2b346926a818) - the core API that the proxy implements and the `go` CLI uses to download packages
- [Proxy Design](https://docs.gomods.io/design/proxy/) - what the proxy is and how it works
- [vgo wiki](https://github.com/golang/go/wiki/vgo) - context and details on how Go dependency management works in general
- ["Go and Versioning"](https://research.swtch.com/vgo) - long papers on Go dependency management details, internals, etc...
- [Go modules wiki](https://github.com/golang/go/wiki/Modules) - context and details on how Go dependency management works in general
- ["Go and Versioning"](https://research.swtch.com/vgo) - long articles on Go dependency management details, internals, etc...
# Built on the Shoulders of Giants
+17 -25
@@ -5,8 +5,12 @@ date: 2018-02-11T15:59:56-05:00
## The Athens Proxy
The Athens project has two components, the [central registry](/design/registry/) and edge proxies.
This document details the latter.
The proxy has two primary use cases:
- Internal deployments
- Public mirror deployments
This document details features of the proxy that you can use to achieve either use case.
## The Role of the Proxy
@@ -16,17 +20,7 @@ We intend proxies to be deployed primarily inside of enterprises to:
- Exclude access to public modules
- Cache public modules
Importantly, a proxy is not intended to be a complete _mirror_ of an upstream registry. For public modules, its role is to cache and provide access control.
## Proxy Details
First and foremost, a proxy exposes the same vgo download protocol as the registry. Since it doesn't have the multi-cloud requirements as the registry does, it supports simpler backend data storage mechanisms. We plan to release a proxy with several backends including:
- In-memory
- Disk
- Cloud blob storage
Users who want to target a proxy configure their `vgo` CLI to point to the proxy, and then execute commands as normal.
Importantly, a proxy is not intended to be a complete _mirror_ of an upstream proxy. For public modules, its role is to cache and provide access control.
## Cache Misses
@@ -37,21 +31,19 @@ If it's private, it immediately does a cache fill operation from the internal VC
If it's not private, the proxy consults its exclude list for non-private modules (see below). If `MxV1` is on the exclude list, the proxy returns 404 and does nothing else. If `MxV1` is not on the exclude list, the proxy executes the following algorithm:
```
registryDetails := lookupOnRegistry(MxV1)
if registryDetails == nil {
return 404 // if the registry doesn't have the thing, just bail out
upstreamDetails := lookUpstream(MxV1)
if upstreamDetails == nil {
return 404 // if the upstream doesn't have the thing, just bail out
}
return registryDetails.baseURL
return upstreamDetails.baseURL
```
The important part of this algorithm is `lookupOnRegistry`. That function queries an endpoint on the registry that either:
The important part of this algorithm is `lookUpstream`. That function queries an endpoint on the upstream proxy that either:
- Returns 404 if it doesn't have `MxV1` in the registry
- Returns the base URL for MxV1 if it has `MxV1` in the registry
- Returns 404 if it doesn't have `MxV1` in its storage
- Returns the base URL for MxV1 if it has `MxV1` in its storage
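A minimal Go sketch of the lookup described above, under stated assumptions: the design does not name the upstream endpoint, so the `/lookup/<module>/<version>` path, the plain-text response body, and the fake upstream server are all hypothetical and exist only to make the example runnable.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// errNotFound mirrors the "return 404" branch of the pseudocode.
var errNotFound = errors.New("module not found upstream")

// lookUpstream asks a (hypothetical) upstream endpoint for the base URL of a
// module version. A 404 maps to errNotFound; a 200 returns the body as the URL.
func lookUpstream(client *http.Client, upstream, module, version string) (string, error) {
	resp, err := client.Get(upstream + "/lookup/" + module + "/" + version)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusOK:
		body, err := io.ReadAll(resp.Body)
		if err != nil {
			return "", err
		}
		return strings.TrimSpace(string(body)), nil
	case http.StatusNotFound:
		return "", errNotFound
	default:
		return "", fmt.Errorf("unexpected upstream status %d", resp.StatusCode)
	}
}

// newFakeUpstream starts an in-process stand-in for an upstream proxy that
// knows exactly one module version. The CDN URL is made up.
func newFakeUpstream() *httptest.Server {
	known := map[string]string{
		"/lookup/github.com/my/module/v1.0.0": "https://cdn.example.com/github.com/my/module",
	}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if base, ok := known[r.URL.Path]; ok {
			fmt.Fprintln(w, base)
			return
		}
		http.NotFound(w, r)
	}))
}

func main() {
	srv := newFakeUpstream()
	defer srv.Close()

	base, err := lookUpstream(srv.Client(), srv.URL, "github.com/my/module", "v1.0.0")
	fmt.Println(base, err)

	_, err = lookUpstream(srv.Client(), srv.URL, "github.com/my/missing", "v1.0.0")
	fmt.Println(errors.Is(err, errNotFound))
}
```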
Finally, if `MxV1` is fetched from a registry server, a background job will be created to periodically check `MxV1` for deletions and/or deprecations. In the event that one happens, the proxy will delete it from the cache.
_In a later version of the project, we may implement an event stream on the registry that the proxy can subscribe to and listen for deletions/deprecations on modules that it cares about_
_In a later version of the project, we may implement an event stream on proxies that any other proxy can subscribe to and listen for deletions/deprecations on modules that it cares about_
## Exclude Lists and Private Module Filters
@@ -64,9 +56,9 @@ To accommodate private (i.e. enterprise) deployments, the proxy maintains two im
Private module filters are string globs that tell the proxy what is a private module. For example, the string `github.internal.com/**` tells the proxy:
- To never make requests to the public internet (i.e. to the registry) regarding this module
- To never make requests to the public internet (i.e. to upstream proxies) regarding this module
- To download module code (in its cache filling mechanism) from the VCS at `github.internal.com`
### Exclude Lists for Public Modules
Exclude lists for public modules are also globs that tell the proxy what modules it should never download from the registry. For example, the string `github.com/arschles/**` tells the proxy to always return `404 Not Found` to clients.
Exclude lists for public modules are also globs that tell the proxy what modules it should never download from any upstream proxy. For example, the string `github.com/arschles/**` tells the proxy to always return `404 Not Found` to clients.
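The two lists can be combined into a single decision per request. The sketch below is an illustration only: it supports just the `prefix/**` glob form used in the examples above (anything else must match exactly), and the names `decide`, `matchGlob`, and the `decision` constants are hypothetical rather than taken from the Athens code.

```go
package main

import (
	"fmt"
	"strings"
)

// decision is what the proxy should do for a requested module.
type decision int

const (
	fetchUpstream decision = iota // public and allowed: ask an upstream proxy
	fetchFromVCS                  // private: fill the cache from the internal VCS
	notFound                      // excluded: answer 404 and do nothing else
)

// matchGlob handles only patterns of the form "prefix/**"; any other pattern
// must equal the module path exactly. Real glob support would be richer.
func matchGlob(pattern, module string) bool {
	if strings.HasSuffix(pattern, "/**") {
		prefix := strings.TrimSuffix(pattern, "**") // keeps the trailing slash
		return strings.HasPrefix(module, prefix)
	}
	return pattern == module
}

// decide checks private filters first, then the exclude list, matching the
// order described in the cache-miss section.
func decide(module string, private, exclude []string) decision {
	for _, p := range private {
		if matchGlob(p, module) {
			return fetchFromVCS
		}
	}
	for _, p := range exclude {
		if matchGlob(p, module) {
			return notFound
		}
	}
	return fetchUpstream
}

func main() {
	private := []string{"github.internal.com/**"}
	exclude := []string{"github.com/arschles/**"}

	for _, m := range []string{
		"github.internal.com/team/app",
		"github.com/arschles/assert",
		"github.com/pkg/errors",
	} {
		fmt.Println(m, decide(m, private, exclude))
	}
}
```

Checking the private filters before the exclude list means a module that matches both is still served from the internal VCS.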
-213
@@ -1,213 +0,0 @@
---
title: "Registry"
date: 2018-02-11T15:58:56-05:00
---
## The Athens Registry
The Athens registry is a Go package registry service that is hosted globally across multiple cloud providers. The **global deployment** will have a DNS name (i.e. `registry.golang.org`) that round-robins across each **cloud deployment**. We will use the following **cloud deployments** for _example only_ in this document:
- Microsoft Azure (hosted at `microsoft.registry.golang.org`)
- Google Cloud (hosted at `google.registry.golang.org`)
- Amazon AWS (hosted at `amazon.registry.golang.org`)
Regardless of which **cloud deployment** is routed to, the **global deployment** must provide up-to-date (precise definition below) module metadata & code.
We intend to create a foundation (the TBD foundation) that manages **global deployment** logistics and governs how each **cloud deployment** participates.
## Glossary
In this document, we will use the following keywords and symbols:
- `OA` - the registry **cloud deployment** hosted on Amazon AWS
- `OG` - the registry **cloud deployment** hosted on Google Cloud
- `OM` - the registry **cloud deployment** hosted on Microsoft Azure
- `MxVy` - the module `x` at version `y`
## Properties of the Registry
The registry should obey the following invariants:
- No existing module or version should ever be deleted or modified
- Except for exceptional cases, like a DMCA takedown (more below)
- Module metadata & code may be eventually consistent across **cloud deployments**
These properties are both important to design the **global deployment** and to ensure repeatable builds in the Go community as much as is possible.
## Technical Challenges
A registry **cloud deployment** has two major concerns:
- Sharing module metadata & code
- Staying current with what other registry **cloud deployment**s are available
For the rest of this document, we'll refer to these concerns as **data exchange** and **membership**, respectively.
Registries will use separate protocols to do **data exchange** and **membership**.
## Data Exchange
The overall design of the **global deployment** should ensure the following:
- Module metadata and code is fetched from the appropriate source (i.e. a VCS)
- Module metadata and code is replicated across all **cloud deployment**s. As previously stated, replication may be eventually consistent.
Each **cloud deployment** holds:
- A module metadata database
- A log of actions it has taken on the database (used to version the module database)
- Actual module source code and metadata
- This is what vgo requests
- Likely stored in a CDN
The module database holds metadata and code for all modules that the cloud deployment is aware of, and the log records all the operations the cloud deployment has done in its lifetime.
## The Module Database
The module database is made up of two components:
- A blob storage system (usually a CDN) that holds module metadata and source code
- This is called the module CDN
- A key/value store that indicates whether and where a module MxV1 exists in the **cloud deployment**'s blob storage
- This is called the module metadata database, or key/value storage
If a **cloud deployment** OM holds modules `MxV1`, `MxV2` and `MyV1`, its module metadata database would look like the following:
```
Mx: {baseLocation: mycdn.com/Mx}
My: {baseLocation: mycdn.com/My}
```
Note that `baseLocation` is intended for use in the `<meta>` redirect response passed to vgo. As a result, it may point to other **cloud deployment** blob storage systems. More information on that in the synching sections below.
## The Log
The log is an append-only record of actions that a **cloud deployment** OM has taken on its module database. The log exists only to facilitate module replication between **cloud deployment**s (more on how replication below).
Below is an example event log:
```
ADD MxV1 ID1
ADD MxV2 ID2
ADD MyV1.5 ID3
```
This log corresponds to a database that looks like the following:
```
Mx: {baseLocation: mycdn.com/Mx}
My: {baseLocation: mycdn.com/My}
```
And blob storage that holds versions 1 and 2 of Mx and version 1.5 of My.
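To make the relationship between the log and the database concrete, here is a small Go sketch that replays the example log into a key/value store and a stand-in for blob storage. The `event` and `moduleDB` types, the split of `MxV1` into separate module and version fields, and the `replay` function are assumptions made for illustration.

```go
package main

import "fmt"

// event is one line of the append-only log, e.g. "ADD MxV1 ID1", with the
// module and version split into separate fields for convenience.
type event struct {
	Op, Module, Version, ID string
}

// moduleDB models the two halves of the module database.
type moduleDB struct {
	baseLocation map[string]string   // key/value store: module -> base location
	versions     map[string][]string // stand-in for blob storage contents
	lastID       string              // ID of the most recent log entry applied
}

// replay rebuilds the database by applying log entries in order.
func replay(log []event, cdn string) *moduleDB {
	db := &moduleDB{
		baseLocation: map[string]string{},
		versions:     map[string][]string{},
	}
	for _, e := range log {
		db.lastID = e.ID
		if e.Op != "ADD" {
			continue // only ADD appears in the example log
		}
		db.baseLocation[e.Module] = cdn + "/" + e.Module
		db.versions[e.Module] = append(db.versions[e.Module], e.Version)
	}
	return db
}

func main() {
	log := []event{
		{"ADD", "Mx", "V1", "ID1"},
		{"ADD", "Mx", "V2", "ID2"},
		{"ADD", "My", "V1.5", "ID3"},
	}
	db := replay(log, "mycdn.com")
	fmt.Println(db.baseLocation["Mx"], db.versions["Mx"])
	fmt.Println(db.baseLocation["My"], db.versions["My"])
	fmt.Println("last applied:", db.lastID)
}
```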
### Log IDs
Note that each event log line holds ID data (`ID1`, `ID2`, etc...). These IDs are used by other **cloud deployment**s as database versions. Details on how these IDs are used are below in the pull sync section.
## Cache Misses
If an individual **cloud deployment** OM gets a request for a module MxV1 that is not in its database, it returns a "not found" (i.e. HTTP 404) response to vgo. Then, the following happens:
- OM starts a background cache fill operation to look for MxV1 on OA and OG
- If OA and OG both report a miss, OM does a cache fill operation from the VCS and does a push synchronization (see below)
- vgo downloads code directly from the VCS on the client's machine
## Pull Sync
Each **cloud deployment** will actively sync its database with the others. Every timer tick `T`, a **cloud deployment** OM will query another **cloud deployment** OA for all the modules that changed or were added since the last time OM synched with OA.
### Query Mechanism
The query obviously relies on OA being able to provide deltas of its database over logical time. Logical time is communicated between OM and OA with log IDs (described above). The query algorithm is approximately:
```
lastID := getLastQueriedID(OA)
newDB, newID := query(OA, lastID) // get the new operations that happened on OA's database since lastID
mergeDB(newDB) // merge newDB into my own DB
storeLastQueriedID(OA, newID) // after this, getLastQueriedID(OA) will return newID
```
The two most important parts of this algorithm are the `newDB` response and the `mergeDB` function.
#### Database Diffs
OA uses its database log to construct a database diff starting from the `lastID` value that it receives from OM. It then sends the diff to OM in JSON that looks like the following:
```json
{
"added": ["MxV1", "MxV2", "MyV1"],
"deleted": ["MaB1", "MbV2"],
"deprecated": ["MdB1"]
}
```
Explicitly, this structure indicates that:
- `MxV1`, `MxV2` and `MyV1` were added since `lastID`
- `MaB1` and `MbV2` were deleted since `lastID`
- `MdB1` was deprecated since `lastID`
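On the receiving side, a diff in this shape could be decoded in Go roughly as follows. The `dbDiff` type and `parseDiff` function are hypothetical names for this sketch; only the three JSON keys come from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dbDiff mirrors the JSON diff that one cloud deployment sends to another.
type dbDiff struct {
	Added      []string `json:"added"`
	Deleted    []string `json:"deleted"`
	Deprecated []string `json:"deprecated"`
}

// parseDiff decodes a diff payload into a dbDiff.
func parseDiff(data []byte) (dbDiff, error) {
	var d dbDiff
	err := json.Unmarshal(data, &d)
	return d, err
}

func main() {
	payload := []byte(`{
		"added": ["MxV1", "MxV2", "MyV1"],
		"deleted": ["MaB1", "MbV2"],
		"deprecated": ["MdB1"]
	}`)

	diff, err := parseDiff(payload)
	if err != nil {
		fmt.Println("bad diff:", err)
		return
	}
	fmt.Println("added:", diff.Added)
	fmt.Println("deleted:", diff.Deleted)
	fmt.Println("deprecated:", diff.Deprecated)
}
```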
#### Database Merging
The `mergeDB` algorithm above receives a database diff and merges the new entries into its own database. It follows a few rules:
- Deletes insert a tombstone into the database
- If a module `MdV1` is tombstoned, all future operations that come via database diffs are sent to `/dev/null`
- If module `MdV2` is deprecated, future add or deprecation diffs for `MdV2` are sent to `/dev/null`. Future delete operations can still tombstone
The approximate algorithm for `mergeDB` is this:
```
func mergeDB(newDB) {
for added in newDB.added {
fromDB := lookup(added)
if fromDB != nil {
continue // the module already exists (it may be deprecated or tombstoned), skip it and move on to the next one
}
addToDB(added) // this adds the module to the module db's key/value store, but points baseLocation to the other cloud deployment's blob storage
go downloadCode(added) // this downloads the module to local blob storage, then updates the key/value store's baseLocation accordingly
}
for deprecated in newDB.deprecated {
fromDB := lookup(deprecated)
if fromDB != nil && fromDB.deleted() {
continue // can't deprecate something that's already deleted, skip it
}
deprecateInDB(deprecated) // importantly, this function inserts a deprecation record into the DB even if the module wasn't already present!
}
for deleted in newDB.deleted {
deleteInDB(deleted) // importantly, this function inserts a tombstone into the DB even if the module wasn't already present!
}
}
```
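For readers who prefer something they can run, the merge rules can be sketched in Go as below. This is a simplified model under stated assumptions: the database is a plain map from module version to state, there is no blob download step, and the names `state`, `db`, and `merge` are hypothetical. Note that it uses `continue` so that one skipped entry does not stop the rest of the diff from being processed.

```go
package main

import "fmt"

// state is the lifecycle state of one module version in the local database.
type state int

const (
	active state = iota
	deprecated
	tombstoned
)

// dbDiff is the set of changes received from another cloud deployment.
type dbDiff struct {
	Added, Deprecated, Deleted []string
}

// db maps a module version (e.g. "MxV1") to its state.
type db map[string]state

// merge applies a diff following the rules above: tombstones always win,
// deprecation records block later adds, and existing entries are never
// overwritten by an add.
func (d db) merge(diff dbDiff) {
	for _, m := range diff.Added {
		if _, exists := d[m]; exists {
			continue // already known (active, deprecated or tombstoned)
		}
		d[m] = active
	}
	for _, m := range diff.Deprecated {
		if d[m] == tombstoned {
			continue // can't deprecate something that is already deleted
		}
		d[m] = deprecated // recorded even if the module wasn't present
	}
	for _, m := range diff.Deleted {
		d[m] = tombstoned // recorded even if the module wasn't present
	}
}

func main() {
	local := db{"MdV1": tombstoned}
	local.merge(dbDiff{
		Added:      []string{"MdV1", "MxV1"},
		Deprecated: []string{"MdV1", "MyV1"},
		Deleted:    []string{"MxV1"},
	})
	fmt.Println(local["MdV1"] == tombstoned, local["MxV1"] == tombstoned, local["MyV1"] == deprecated)
}
```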
## Push Sync
If a **cloud deployment** OM has a cache miss on a module MxV1, starts a cache fill operation, and discovers that neither OG nor OA has MxV1, it fills from the VCS. After it finishes the fill operation, it saves the module code and metadata to its module database and adds a log entry for it. The algorithm looks like the following:
```
newCode := fillFromVCS(MxV1)
storeInDB(newCode)
storeInLog(newCode)
pushTo(OA, newCode) // retry and give up after N failures
pushTo(OG, newCode) // retry and give up after N failures
```
The `pushTo` function is most important in this algorithm. It _only_ sends the existence of a new module, but no event log metadata (i.e. `lastID`):
```
func pushTo(OA, newCode) {
http.POST(OA, newCode.moduleName, newCode.moduleVersion, "https://OM.com/fetch")
}
```
The endpoint in OA that receives the HTTP `POST` request in turn does the following:
```
func receive(moduleName, moduleVersion, fetchURL) {
addToDB(moduleName, moduleVersion, OM) // stores moduleName and moduleVersion in the key/value store, with baseLocation pointing to OM
go downloadCode(moduleName, moduleVersion) // this downloads the module to local blob storage, then updates the key/value store's baseLocation accordingly
}
```
Note again that `lastID` is not sent. Future pull syncs that OA does from OM will receive moduleName/moduleVersion in the 'added' section, and OA will properly do nothing because it already has moduleName/moduleVersion.
+11 -24
@@ -4,19 +4,17 @@ description: Frequently Asked Questions
menu: shortcuts
---
### Is Athens just a proxy? A registry?
### Is Athens Just a Proxy? A Registry?
_TL;DR We've discovered that "registry" doesn't describe what we're trying to do here. The term "global proxy pool" is probably a better description, but it's still an open question._
_TL;DR "Registry" doesn't describe what Athens is trying to do here. That implies that there's only one service in the world that can serve Go modules to everyone. Athens isn't trying to be that. Instead, Athens is trying to be part of a federated group of module proxies._
A registry is generally run by one entity, is one logical server that provides authentication (and provenance sometimes), and is pretty much the de-facto only source of dependencies. Sometimes it's run by a for-profit company.
That's most definitely not what we in the Athens community are going for, and that would harm our community if we did go down that path.
We think that a federated discovery/auth/provenance system is a great resource for folks building proxies, and although it's young, we think that the Athens proxy is growing into a good quality implementation. But it doesn't have to be the only one.
First and foremost, Athens is an _implementation_ of the [Go Modules download API](./intro/protocol). Not only does the standard Go toolchain support any implementation of that API, the proxy is designed to talk to any other server that implements that API as well. That allows Athens to talk to other proxies in the community.
We're purposefully building this project - and working with the toolchain folks - in a way that everyone who wants to write a proxy can participate. Even if they don't use the federated bits.
So, if you look back to "architecture" above, there are a few discrete "things" involved in this system we're building. The term "proxy" describes what it's trying to do fairly well. But there are other things going on too. The term "global proxy pool" covers everything in the global, federated system.
Finally, we're purposefully building this project - and working with the toolchain folks - in a way that everyone who wants to write a proxy can participate.
### Does Athens integrate with the go toolchain?
@@ -26,14 +24,6 @@ For the TL;DR of the protocol, it's a REST API that lets the go toolchain (i.e.
Athens is a server that implements the protocol. Athens, the protocol, and the toolchain (as you almost certainly know) are all open source.
### Is Athens a centralized registry?
We have in mind an architecture that:
- Provides a centralized authentication system, code provenance, and discovery system for modules in VCSs
- Is run by many companies, likely under a foundation
- Allows proxies (i.e. the Athens proxy) to use it, if they want, to serve go modules that live on public VCSs
### Are the packages served by Athens immutable?
_TL;DR Athens does store code in CDNs and has the option to store code in other persistent datastores._
@@ -42,28 +32,25 @@ The longer version:
It's virtually impossible to ensure immutable builds when source code comes from Github. We have been annoyed by that problem for a long time. The Go modules download protocol is a great opportunity to solve this issue. The Athens proxy works pretty simply at a high level:
1. go get github.com/my/module@v1 happens
1. `go get github.com/my/module@v1` happens
1. Athens looks in its datastore and finds that the module is missing
1. Athens downloads github.com/my/module@v1 from Github (it uses go get on the backend too)
1. Athens downloads `github.com/my/module@v1` from Github (it uses go get on the backend too)
1. Athens stores the module in its datastore
1. Athens serves github.com/my/module@v1 from its datastore forever
1. Athens serves `github.com/my/module@v1` from its datastore forever
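The steps above amount to a fill-once store. A minimal Go sketch, assuming an in-memory map in place of a real datastore and a caller-supplied fetch function in place of the VCS download (both are simplifications, and the `proxy`, `fetchFunc`, and `newProxy` names are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc downloads a module version from its origin (e.g. a VCS host).
type fetchFunc func(module, version string) ([]byte, error)

// proxy keeps every module version it has ever fetched; nothing is evicted.
type proxy struct {
	mu    sync.Mutex
	store map[string][]byte // stand-in for a CDN or database
	fetch fetchFunc
}

func newProxy(fetch fetchFunc) *proxy {
	return &proxy{store: map[string][]byte{}, fetch: fetch}
}

// get serves from the datastore when possible and fills it on a miss.
func (p *proxy) get(module, version string) ([]byte, error) {
	key := module + "@" + version

	p.mu.Lock()
	defer p.mu.Unlock()

	if data, ok := p.store[key]; ok {
		return data, nil // already stored: the origin is never contacted again
	}
	data, err := p.fetch(module, version)
	if err != nil {
		return nil, err
	}
	p.store[key] = data
	return data, nil
}

func main() {
	calls := 0
	p := newProxy(func(module, version string) ([]byte, error) {
		calls++
		return []byte("source of " + module + "@" + version), nil
	})

	p.get("github.com/my/module", "v1")
	p.get("github.com/my/module", "v1")
	fmt.Println("origin fetches:", calls)
}
```

Because the second request is answered from the store, later changes or deletions at the origin do not affect what clients receive for that version.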
To repeat, "datastore" means a CDN (we currently have support for Google Cloud Storage and Azure Blob Storage and AWS S3) or another datastore (we have support for MongoDB, disk and some others).
And, sidenote - we don't have many concrete details on this aforementioned foundation/group, but we would like to see them pay for CDN hosting (and other hosting fees). We are
currently coordinating with hosting providers on these questions.
To repeat, "datastore" means a CDN (we currently have support for Google Cloud Storage, Azure Blob Storage and AWS S3) or another datastore (we have support for MongoDB, disk and some others).
One final note - we use "caching" in lots of our docs, and that's technically wrong because no data is evicted or expires. We'll need to update that terminology.
### Can the proxy authenticate to private repositories?
_tldr: yes, with proper authentication configuration defined on the Athens proxy host._
_TL;DR: yes, with proper authentication configuration defined on the Athens proxy host._
When the GOPROXY environment variable is set on the client-side, the go 1.11+ cli
When the `GOPROXY` environment variable is set on the client-side, the Go 1.11+ CLI
does not attempt to request the meta tags, via a request that looks like `https://example.org/pkg/foo?go-get=1`.
Internally Athens uses `go get` under the hood (`go mod download` to be exact)
without the GOPROXY environment variable set so that `go` will in turn request
without the `GOPROXY` environment variable set so that `go` will in turn request
the meta tags using the standard authentication mechanisms supported by `go`.
Therefore, if `go` before v1.11 worked for you, then go 1.11+ with GOPROXY
should work as well, provided that the Athens proxy host is configured with the
+1 -1
@@ -21,4 +21,4 @@ We intend proxies to be deployed primarily inside of enterprises to:
* Exclude access to public modules
* Cache public modules
Importantly, a proxy is not intended to be a complete mirror of an upstream registry. For public modules, its role is to cache and provide access control.
Importantly, a proxy is not intended to be a complete mirror of an upstream proxy. For public modules, its role is to cache and provide access control.
+1 -1
@@ -8,7 +8,7 @@
{{else}}
<img src="/banner.png"/>
<h1>Athens: Registry and Proxy for Go Modules</h1>
<h1>Athens: Proxy for Go Modules</h1>
<p>Welcome to the Athens documentation.</p>
<ul>
<li><b>1. </b> Create an _index.md document in <b>content</b> folder and fill it with Markdown content</li>