Preprocess zip archives on load and cache file structure

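Preprocess the zip archive on load and cache only the relevant files:

  • cache only the list of relevant files, each with a reference to where its data can be fetched efficiently
  • store that list in a map so that lookups by name avoid scanning a flat list (O(1) on average for a Go map)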
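The two points above could be sketched roughly as follows. This is a minimal illustration, not the code from !326: `indexArchive`, `sampleArchive` and `publicPrefix` are hypothetical names, and the `public/` filter is taken from the discussion quoted further down.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"strings"
)

// publicPrefix is an assumption: only files under public/ are considered relevant.
const publicPrefix = "public/"

// indexArchive walks the archive's file list once and returns a map from
// file name to entry, keeping only entries under public/.
func indexArchive(r *zip.Reader) map[string]*zip.File {
	index := make(map[string]*zip.File, len(r.File))
	for _, f := range r.File {
		if !strings.HasPrefix(f.Name, publicPrefix) {
			continue // drop everything outside public/
		}
		index[f.Name] = f
	}
	return index
}

// sampleArchive builds a small in-memory zip, for demonstration only.
func sampleArchive(names ...string) (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range names {
		fw, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := fw.Write([]byte("content of " + name)); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	r, err := sampleArchive("public/index.html", "public/css/site.css", "README.md")
	if err != nil {
		panic(err)
	}
	index := indexArchive(r)

	// A lookup is now a single map access instead of a scan over r.File.
	_, found := index["public/index.html"]
	fmt.Println(len(index), found)
}
```

The index is built once per archive, so the cost of walking the file list is paid on load rather than on every request.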

Reference code !326 (diffs)

Depends on #443 (closed)

Considerations

During the profiler demo with ~"team::Scalability" we noticed that `readArchive` allocates a considerable amount of memory in production: about 28 MB for the top 1%, for docs.gitlab.com alone.

Profiler on GCP (internal)

[Image: Screen_Shot_2020-10-08_at_3.21.17_pm]

Allocated memory spike after enabling Zip for 5% of Pages projects on 2020-10-08 ~12:30 UTC

[Image: Screen_Shot_2020-10-12_at_10.52.25_am]


The following discussion from !299 (closed) should be addressed:

- [ ] @ayufan started a discussion:
> Oh, yes, yes, yes. It would really help to convert a flat list into:
> 
> - `map[string]zip.File`, ideally with `zip.File` having an `offset` as well
> 
> And, drop all the ones that are not within `public/`
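One way to read the suggestion about keeping an `offset` is to store a small record per file rather than the whole `zip.File`. The sketch below assumes exactly that: the `entry` struct and the `buildEntries`, `readStored` and `storedArchive` helpers are hypothetical, and only `(*zip.File).DataOffset` and the header fields come from the standard `archive/zip` package. For brevity it works on an in-memory `[]byte` and reads only uncompressed (`zip.Store`) entries; a real implementation would more likely read through an `io.ReaderAt` and handle deflate as well.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"strings"
)

// entry is a hypothetical compact record kept per cached file.
type entry struct {
	offset int64  // where the file's data starts inside the archive
	size   uint64 // compressed size, i.e. how many bytes to fetch
	method uint16 // compression method (zip.Store or zip.Deflate)
}

// buildEntries records the data offset of every file under public/.
func buildEntries(archive []byte) (map[string]entry, error) {
	r, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	entries := make(map[string]entry)
	for _, f := range r.File {
		if !strings.HasPrefix(f.Name, "public/") || strings.HasSuffix(f.Name, "/") {
			continue // skip files outside public/ and directory entries
		}
		off, err := f.DataOffset()
		if err != nil {
			return nil, err
		}
		entries[f.Name] = entry{offset: off, size: f.CompressedSize64, method: f.Method}
	}
	return entries, nil
}

// readStored returns the bytes of an uncompressed entry using only the
// cached offset and size, without consulting the zip directory again.
func readStored(archive []byte, e entry) ([]byte, error) {
	if e.method != zip.Store {
		return nil, fmt.Errorf("compression method %d is not handled in this sketch", e.method)
	}
	end := e.offset + int64(e.size)
	if e.offset < 0 || end > int64(len(archive)) {
		return nil, fmt.Errorf("entry is out of bounds")
	}
	return archive[e.offset:end], nil
}

// storedArchive builds an in-memory zip with uncompressed entries, for demonstration only.
func storedArchive(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for name, content := range files {
		fw, err := w.CreateHeader(&zip.FileHeader{Name: name, Method: zip.Store})
		if err != nil {
			return nil, err
		}
		if _, err := fw.Write([]byte(content)); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := storedArchive(map[string]string{
		"public/index.html": "<h1>hello</h1>",
		"private/key.txt":   "secret",
	})
	if err != nil {
		panic(err)
	}
	entries, err := buildEntries(data)
	if err != nil {
		panic(err)
	}
	body, err := readStored(data, entries["public/index.html"])
	if err != nil {
		panic(err)
	}
	fmt.Println(len(entries), string(body))
}
```

Keeping only the offset, size and method per file means the cached structure stays small, and a file can later be fetched with a single ranged read.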

Cache flat list idea !299 (comment 377210617)

Edited by Jaime Martinez