Some time ago I was working on a side project in Clojure. The project was an advertisement aggregator: it scraped and collected data from different websites, normalized it, and presented a simple and powerful search to end users.
The project consisted of a few services that shared some common code: data validation, data schemas, and so on. That was fine while all development happened locally on my machine.
Clojure is a JVM-based language, so it uses Jar files as its build target format. When you build a library locally, the build tool installs the Jar file in a global cache on your machine. Any project that depends on that library finds it by looking it up in the global cache.
But as soon as I started moving everything to AWS, the build process changed. To build an application that depends on the shared code, the build tool now needed to fetch the shared library's artifacts from somewhere, and since there was no place to find them, builds were failing. The question became: how do I make the shared Jar files accessible?
Initially, I wasted some time trying to configure an S3 plugin for Leiningen (the Clojure build tool), but the plugin was broken and outdated. The next idea was to use an existing solution, for example the Nexus repository widely used by Java developers. But its minimal resource requirements were higher than those of all my AWS services combined: Nexus required at least 4GB of RAM.
In the modern era, 4GB sounds like nothing. But it frustrated me enough that I decided to reverse engineer how Leiningen worked. By sniffing its traffic, I found out that Leiningen essentially uses only two HTTP endpoints.
The first is for reading and the second for writing build artifacts:

```
GET  /*filepath
POST /*filepath
```
On `GET`, it should return the artifact files. Leiningen requests them like this:

```
GET https://nexus.atsman.com/repository/releases/com/atsman/my-library.jar
```
On `POST`, it sends build artifacts the same way, building up a hierarchy of files:

```
POST https://nexus.atsman.com/repository/releases/com/atsman/my-library.pom
```
This could even be done in a single Lambda function, but at the time I decided to write my own small, simple artifact repository in Go and deploy it in a Kubernetes cluster.
Why Go? There are so many languages out there that I stick to a pragmatic approach to development: use the tools that best suit your needs. For frontend development, JS/TS. Need to analyze some data? Use Python and its data science stack. For everything related to infrastructure, Go: ease of development, cross-compilation, fast compile times, a single binary as the build target, tiny Docker images...
My initial plan was to build as small and as light a service as possible.
The first thing to choose was a routing library. There are tons of them on GitHub: Gin, Echo, Goji, the list is endless...
I decided to go with [httprouter](https://github.com/julienschmidt/httprouter). It is super simple and has everything needed.
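To give an idea of how simple this is, here is a tiny runnable sketch of those two routes in httprouter (the handlers are just placeholders; the real ones come from the controller below):

```go
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/julienschmidt/httprouter"
)

func main() {
	router := httprouter.New()
	// One catch-all parameter captures the whole artifact path,
	// e.g. /repository/releases/com/atsman/my-library.jar.
	router.GET("/*filepath", func(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
		fmt.Fprintf(w, "would read %s\n", ps.ByName("filepath"))
	})
	router.POST("/*filepath", func(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
		fmt.Fprintf(w, "would write %s\n", ps.ByName("filepath"))
	})
	log.Fatal(http.ListenAndServe(":8080", router))
}
```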
The basic file controller looks like this:
```go
type FileController struct {
	Storage Storage
}

func NewFileController(storage Storage) *FileController {
	return &FileController{
		Storage: storage,
	}
}

func (ctrl *FileController) GetFile(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	// ...
}

func (ctrl *FileController) PutFile(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	// ...
}
```
It depends on an abstract Storage interface. That interface can be implemented in any way: it can be the file system, it can be S3, it can be whatever storage you can imagine. FileController doesn't know what is inside.
The actual implementation can be found here: Controller
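For illustration, here is a minimal sketch of what the two handlers might do on top of the Storage interface defined next (the status codes and error handling are my assumptions, not the actual implementation):

```go
func (ctrl *FileController) GetFile(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	// The catch-all route parameter holds the requested artifact path.
	file, err := ctrl.Storage.ReadFile(ps.ByName("filepath"))
	if err != nil {
		http.Error(w, err.Error(), http.StatusNotFound)
		return
	}
	defer file.Close()
	io.Copy(w, file) // stream the artifact to the client
}

func (ctrl *FileController) PutFile(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	// Hand the request body straight to the storage backend.
	if err := ctrl.Storage.WriteFile(ps.ByName("filepath"), r.Body); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusCreated)
}
```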
For the storage implementation, I designed the following abstract interface:
```go
type Storage interface {
	ReadFile(path string) (io.ReadCloser, error)
	WriteFile(path string, file io.ReadCloser) error
}
```
And a factory function to create a storage of the needed type:
```go
func NewStorage(config StorageConfig) Storage {
	switch config.Type {
	case "s3":
		return NewS3Storage(config)
	case "fs":
		return NewFileSystemStorage(config)
	default:
		panic("Unknown storage type")
	}
}
```
FileSystemStorage:
```go
type FileSystemStorage struct {
	Config StorageConfig
}

func NewFileSystemStorage(config StorageConfig) *FileSystemStorage {
	return &FileSystemStorage{
		Config: config,
	}
}

func (fs *FileSystemStorage) ReadFile(path string) (io.ReadCloser, error) {
	fullPath := resolvePath(fs.Config.BaseDir, path)
	return os.Open(fullPath)
}

func (fs *FileSystemStorage) WriteFile(path string, file io.ReadCloser) error {
	fullPath := resolvePath(fs.Config.BaseDir, path)
	directoryPath, _ := parseFilepath(fullPath)
	// Create the whole directory hierarchy for the artifact, if needed.
	if err := os.MkdirAll(directoryPath, os.ModePerm); err != nil {
		return err
	}
	outFile, err := os.Create(fullPath)
	if err != nil {
		return err
	}
	defer outFile.Close()
	_, err = io.Copy(outFile, file)
	return err
}
```
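The resolvePath and parseFilepath helpers are not shown in the post. A plausible sketch using the standard path/filepath package (these exact helpers are my assumption):

```go
import "path/filepath"

// resolvePath joins the configured base directory with the requested
// artifact path, e.g. ("/tmp/nexus-minimal", "com/atsman/my-library.jar").
func resolvePath(baseDir, path string) string {
	return filepath.Join(baseDir, path)
}

// parseFilepath splits a full path into its directory and file name.
func parseFilepath(fullPath string) (string, string) {
	return filepath.Dir(fullPath), filepath.Base(fullPath)
}
```

Note that filepath.Join cleans the path but does not stop `..` segments from escaping the base directory, so a production version would want an explicit path traversal check.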
S3Storage:
```go
type S3Storage struct {
	Config     StorageConfig
	S3Client   *s3.S3
	S3Uploader *s3manager.Uploader
}

func NewS3Storage(config StorageConfig) *S3Storage {
	// ...
}

func (s *S3Storage) ReadFile(path string) (io.ReadCloser, error) {
	ctx := context.Background()
	result, err := s.S3Client.GetObjectWithContext(ctx, &s3.GetObjectInput{
		Bucket: aws.String(s.Config.Bucket),
		Key:    aws.String(path),
	})
	if err != nil {
		return nil, err
	}
	return result.Body, nil
}

func (s *S3Storage) WriteFile(path string, file io.ReadCloser) error {
	upParams := &s3manager.UploadInput{
		Bucket: aws.String(s.Config.Bucket),
		Key:    aws.String(path),
		Body:   file,
	}
	_, err := s.S3Uploader.Upload(upParams)
	return err
}
```
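The NewS3Storage constructor is elided above. Here is a minimal sketch of how it could wire up the AWS SDK for Go (v1) from the config, assuming static credentials and a hard-coded region:

```go
import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func NewS3Storage(config StorageConfig) *S3Storage {
	sess := session.Must(session.NewSession(&aws.Config{
		Region: aws.String("us-east-1"), // assumed; could come from config
		Credentials: credentials.NewStaticCredentials(
			config.AccessKey, config.SecretKey, ""),
	}))
	return &S3Storage{
		Config:     config,
		S3Client:   s3.New(sess),
		S3Uploader: s3manager.NewUploader(sess),
	}
}
```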
For configuration management, I decided to go with viper, a widely used library.
```go
func NewConfig() Config {
	// ...
	config := Config{}
	config.HTTP.Address = viper.GetString("http.addr")
	config.HTTP.HTTPS = viper.GetBool("http.https")
	config.HTTP.Certificate = viper.GetString("http.crt")
	config.HTTP.Key = viper.GetString("http.key")
	config.HTTP.Username = viper.GetString("http.username")
	config.HTTP.Password = viper.GetString("http.password")

	config.Storage.Type = viper.GetString("storage.type")
	config.Storage.Bucket = viper.GetString("storage.bucket")
	config.Storage.AccessKey = viper.GetString("storage.access_key")
	config.Storage.SecretKey = viper.GetString("storage.secret_key")
	config.Storage.BaseDir = viper.GetString("storage.base_dir")
	if config.Storage.BaseDir == "" {
		config.Storage.BaseDir = "./.nexus"
	}
	return config
}
```
The application searches for `config.yml` in `/etc/nexus-minimal` or in the directory where you run the binary. It is also possible to specify the config location via the `CONFIG_PATH` env variable, as a full path with file name and extension.
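This lookup logic is the elided part of NewConfig. With viper it could look roughly like this (a sketch of the described behavior, not the actual code):

```go
import (
	"os"

	"github.com/spf13/viper"
)

func initViper() error {
	if path := os.Getenv("CONFIG_PATH"); path != "" {
		// A full path with file name and extension wins.
		viper.SetConfigFile(path)
	} else {
		viper.SetConfigName("config") // looks for config.yml
		viper.SetConfigType("yaml")
		viper.AddConfigPath("/etc/nexus-minimal")
		viper.AddConfigPath(".") // directory the binary runs from
	}
	return viper.ReadInConfig()
}
```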
For S3 (the bucket key here matches the `storage.bucket` key read in NewConfig):

```yaml
---
http:
  addr: ":443"
  username: "myuser"
  password: "mypassword"
  https: true
  crt: "/certs/domain.crt"
  key: "/certs/domain.key"
storage:
  type: "s3"
  bucket: "my-super-nexus-bucket"
  access_key: "xxxxxxxxxxxxxxxxxxx"
  secret_key: "xxxxxxxxxxxxxxxxxxxxxxxxxxx"
```
And for the file system:

```yaml
---
http:
  addr: ":8080"
  username: "myuser"
  password: "mypassword"
storage:
  type: "fs"
  base_dir: "/tmp/nexus-minimal"
```
In the main function, we initialize the configuration, reading it and converting it into Go structs for ease of use. Then NewStorage returns a specific implementation of the Storage interface based on what is in the config, and the HTTP controller is instantiated with that storage passed in as a dependency.
```go
func main() {
	InitLogger()
	config := NewConfig()
	storage := NewStorage(config.Storage)
	controller := NewFileController(storage)
	InitHttp(config.HTTP, controller)
}
```
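InitLogger and InitHttp are not shown in the post. As a rough sketch, InitHttp could wire the controller into httprouter and start either a plain or a TLS server depending on the config (the HTTPConfig field names follow NewConfig above; the basicAuth wrapper is my assumption):

```go
func InitHttp(cfg HTTPConfig, ctrl *FileController) {
	router := httprouter.New()
	router.GET("/*filepath", basicAuth(cfg, ctrl.GetFile))
	router.POST("/*filepath", basicAuth(cfg, ctrl.PutFile))

	if cfg.HTTPS {
		log.Fatal(http.ListenAndServeTLS(cfg.Address, cfg.Certificate, cfg.Key, router))
	} else {
		log.Fatal(http.ListenAndServe(cfg.Address, router))
	}
}

// basicAuth is a hypothetical middleware checking the configured credentials.
func basicAuth(cfg HTTPConfig, next httprouter.Handle) httprouter.Handle {
	return func(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
		user, pass, ok := r.BasicAuth()
		if !ok || user != cfg.Username || pass != cfg.Password {
			w.Header().Set("WWW-Authenticate", `Basic realm="nexus-minimal"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next(w, r, ps)
	}
}
```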
Thanks to Go, it is possible to create a minimal Docker image based on the Alpine image. The Dockerfile below uses a Docker multi-stage build: it first builds the project using the golang image as a base, then copies the binary into a minimal alpine image.
Dockerfile
```dockerfile
FROM golang:latest as builder
WORKDIR /go/src/github.com/atsman/nexus-minimal/
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o ./out/nexus-minimal ./internal

FROM alpine:latest
RUN apk --no-cache add ca-certificates
VOLUME ["/etc/nexus-minimal"]
WORKDIR /root/
COPY --from=builder /go/src/github.com/atsman/nexus-minimal/out .
CMD ./nexus-minimal
```
How to use it?
```
docker run -d -v /etc/nexus-minimal:/etc/nexus-minimal -p 8080:8080 astma/nexus-minimal
```
Some things only look complex, hiding really basic stuff under a candy wrapper. Of course, the real-world implementation has many more details hidden in thousands of lines of Java code. But for my specific need, five structs are more than enough.
Nexus minimal: