geminispace.info

git clone git://code.clttr.info/geminispace.info.git

commit dd46f0e29bf49df3637eccb28477d409c6e698d7
parent dbec660e8274c79bd73420d745db8356019c566e
Author: Natalie Pendragon <natpen@natpen.net>
Date:   Fri, 29 May 2020 14:40:56 -0400

[serve] Make sure two closely-timed seed requests don't break

This will prevent seed requests' incremental crawls from stomping on
each other, but due to the way in which incremental crawls
resolve (i.e., by restarting the entire GUS serve process via
systemctl), it also means any seed requests that came in after the
first will not be handled until either A) another seed request comes
in that ends up dealing with them, or B) a manual crawl is kicked off.
The situation is no worse than before, however, so this is still an
improvement in the short term.

Diffstat:
M gus/serve.py | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/gus/serve.py b/gus/serve.py
@@ -29,6 +29,7 @@ gemini_highlighter = highlight.Highlighter(
     scorer=GeminiScorer(),
     order=highlight.SCORE,
 )
+crawl_thread_lock = threading.Lock()
 
 def load_and_compute_statistics(filename):
     statistics = load_last_statistics_from_file(filename)
@@ -380,8 +381,16 @@ def search(request):
 
 
 def crawl_seed_and_restart(seed_url):
-    run_crawl(should_run_destructive=False, seed_urls=[seed_url])
-    call(["sudo", "systemctl", "restart", "gus.service"])
+    # NB: this lock will never get released under normal conditions, as the
+    # expected conclusion of the incremental crawl thread is issue a call
+    # to systemctl to restart the entire GUS serve process. That new process
+    # will reinitialize everything, including a fresh, unlocked Lock object.
+    # However, if the incremental crawl thread crashes for some reason, it
+    # should catch the exception and release the lock, so new seed requests
+    # can kick off their own incremental crawls.
+    with crawl_thread_lock:
+        run_crawl(should_run_destructive=False, seed_urls=[seed_url])
+        call(["sudo", "systemctl", "restart", "gus.service"])
 
 
 @app.route("/add-seed")
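
The locking behaviour described in the commit message can be illustrated
with a minimal, self-contained sketch. It is not GUS code: fake_crawl, the
events list, and the example seed URLs are hypothetical stand-ins, and the
systemctl restart is left out so the script can run anywhere. It shows two
closely-timed requests being serialized by a module-level threading.Lock,
and the lock being free again after a crawl raises. Note that the with
statement releases the lock on an exception but does not catch it; the
sketch catches it separately.

```python
import threading
import time

# Module-level lock, mirroring crawl_thread_lock in the diff.
crawl_thread_lock = threading.Lock()
events = []  # (event, seed_url) tuples, appended in the order they occur


def fake_crawl(seed_url, fail=False):
    """Hypothetical stand-in for run_crawl; only records what happened."""
    events.append(("start", seed_url))
    time.sleep(0.05)  # pretend to do some crawling
    if fail:
        raise RuntimeError("simulated crawl crash")
    events.append(("end", seed_url))


def crawl_seed(seed_url, fail=False):
    """Same shape as crawl_seed_and_restart, minus the service restart."""
    try:
        # Only one thread at a time gets past this line; the lock is
        # released when the block exits, including via an exception.
        with crawl_thread_lock:
            fake_crawl(seed_url, fail)
    except RuntimeError:
        # By the time we get here the with block has released the lock.
        events.append(("crashed", seed_url))


def run_demo():
    # Two seed requests arriving at almost the same time.
    seeds = ["gemini://a.example/", "gemini://b.example/"]
    threads = [threading.Thread(target=crawl_seed, args=(s,)) for s in seeds]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # A third crawl that crashes; the lock must be free afterwards.
    crawl_seed("gemini://c.example/", fail=True)
    return list(events)


log = run_demo()
for entry in log:
    print(entry)
print("lock still held:", crawl_thread_lock.locked())
```

Whichever of the first two threads wins the lock, its "start" and "end"
entries appear together, so the two crawls never interleave. In the real
service the first crawl would end by restarting the process, which is why
the commit message notes that later seed requests may go unhandled.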