Under review

Bibbase is accessing cached version of bibtex file

Dhaivat 4 months ago updated 4 months ago 8

Hi,

I am using BibBase for the publications page of our lab group, and I am hosting the bibtex file on our university's web server. The issue is that even when I update the bibtex file (available here: https://research.seas.ucla.edu/licos/files/2019/08/publicactions_v1.bib), the changes are not reflected on the publications page of our website. It seems that BibBase is still accessing an older version of the file instead of the updated one. I have tried the option 'nocache=1', but the page still does not pick up the updated bibtex file.

Can you please take a look into this and let me know?

Thanks,

Dhaivat


Hi Dhaivat,

Happy to take a look! Can you also send me the URL of the publications page where this bibtex file is used?

Thanks.

I think I found it. It's https://www.licos.ee.ucla.edu/publications/, right?

Can you help me find the change you made to the bibtex file that you don't see updating on the page? The page has 301 publications, and the bibtex file also contains 301 entries, so it's not obvious what the difference is.

Thanks.

Hi Christian,


Yes, the link is correct, and the number of entries displayed is correct as well.

I have made the following changes:

1. I have added a URL and an abstract to the following publication, but they do not appear on the page:

@article{nikolopoulos2020group,
  title={Group testing for overlapping communities},
  author={Nikolopoulos, Pavlos and Srinivasavaradhan, Sundara Rajan and Guo, Tao and Fragouli, Christina and Diggavi, Suhas},
  journal={arXiv preprint arXiv:2012.02804},
  year={2020},
  type={1},
  tags={journalSub,PET},
}

2. I have added an author to the following publication, but the new author does not appear either:

@article{nikolopoulos2020community,
 abstract = {Group testing pools together diagnostic samples to reduce the number of tests needed to identify infected members in a population. The observation we make in this paper is that we can leverage a known community structure to make group testing more efficient. For example, if n population members are partitioned into F families, then in some cases we need a number of tests that increases (almost) linearly with kf, the number of families that have at least one infected member, as opposed to k, the total number of infected members. We show that taking into account community structure allows to reduce the number of tests needed for adaptive and non-adaptive group testing, and can improve the reliability in the case where tests are noisy.},
 author = {Nikolopoulos, Pavlos and Guo, Tao and Fragouli, Christina and Diggavi, Suhas},
 journal = {arXiv preprint arXiv:2007.08111},
 tags = {journalSub,PET},
 title = {Community aware group testing},
 type = {1},
 url_arxiv = {https://arxiv.org/abs/2007.08111},
 year = {2020}
}

Please let me know if you need any other information.


Thanks,

Dhaivat

This looks like an issue with your hosting provider, which is where the bibtex file is hosted. They seem to use a CDN (content delivery network, essentially a globally distributed cache) and the BibBase server seems to be getting the same result each time it requests your bib file. Yahya Ezzeldin from https://research.seas.ucla.edu/arni/ actually reported the same issue just recently.


> curl -I https://research.seas.ucla.edu/licos/files/2019/08/publicactions_v1.bib
HTTP/2 302
server: openresty
date: Thu, 17 Dec 2020 01:40:32 GMT
content-type: image/bib
location: https://cpb-us-w2.wpmucdn.com/research.seas.ucla.edu/dist/b/22/files/2019/08/publicactions_v1.bib
cache-control: public, max-age=31536000
etag: 1c2adaf15f4bd8d6126f09d2e8e66643
x-cache: BYPASS
x-cache-bypass-reason: Arguments found

If I interpret that max-age correctly, the cached response may stay valid for up to a year (max-age is given in seconds, and 31536000 seconds is 365 days), so the problem is unlikely to resolve itself soon after your edit. But I'll also look for a way for BibBase to work around this (there might be ways to force the CDN provider to refetch the latest version).

In case it helps, this is the response the bibbase server is currently receiving back from your host:

2020-12-17T17:44:37.909Z [get] parsing bibtex response {
statusCode: 200,
content: '@article{nikolopoulos2020group,\n' +
' title={Group testing for overlapping communities},\n' +
' author={Nikolopoulos, Pavlos and Srinivasavaradhan, Sundara Rajan and Guo, Tao and Fragouli, Christina and Diggavi, Suhas},\n' +
' journal={arXiv preprint arXiv:2012.02804},\n' +
' year={2020},\n' +
' type={1},\n' +
' tags={journalSub,PET},\n' +
'}\n' +
...

with these response headers:

headers: {
  server: 'nginx',
  date: 'Thu, 17 Dec 2020 17:44:37 GMT',
  'content-type': 'application/octet-stream',
  'content-length': '277003',
  connection: 'keep-alive',
  'x-amz-id-2': 'rqGQLN0EbX6/JSJzs/ZjBOgBq8HuCMiGDYvlamKaGWUpCeTy28smhHA0GbYqcJxRt3IF4Dx0Xn4=',
  'x-amz-request-id': '562041A0612D8C76',
  'last-modified': 'Sat, 12 Dec 2020 21:23:39 GMT',
  etag: 'W/"50858318b7745df8c1eaad3520bac5ea"',
  'x-amz-version-id': 'jg0nqRDISTsgIq6hVezjG9BydNPTdUtU',
  expires: 'Sun, 12 Dec 2021 17:44:14 GMT',
  'cache-control': 'max-age=31104000',
  'access-control-allow-origin': '*',
  vary: 'Accept-Encoding',
  'x-cache': 'HIT',
  'accept-ranges': 'bytes'
},
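
As a side note, HTTP defines max-age in seconds. The following is only a minimal Python sketch for reading the cache headers quoted in this thread: the helper name max_age_days is made up for illustration, and the Dec 16 edit is assumed to be midnight UTC because the thread gives no time of day. It converts the two max-age values seen above into days and checks whether the Last-Modified header predates the Dec 16 edit.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def max_age_days(cache_control):
    """Return the max-age directive of a Cache-Control value in days, or None."""
    match = re.search(r"max-age=(\d+)", cache_control)
    # max-age is expressed in seconds; one day is 86400 seconds.
    return int(match.group(1)) / 86400 if match else None


# Cache-Control values copied from the two responses in this thread.
print(max_age_days("public, max-age=31536000"))  # 365.0 (redirect response)
print(max_age_days("max-age=31104000"))          # 360.0 (file response)

# Last-Modified as received by the BibBase server.
last_modified = parsedate_to_datetime("Sat, 12 Dec 2020 21:23:39 GMT")
# Assumption: the exact time of the Dec 16 edit is unknown, so midnight UTC is used.
dec16_edit = datetime(2020, 12, 16, tzinfo=timezone.utc)
print(last_modified < dec16_edit)  # True: the served copy is older than that edit
```

Both values are on the order of a year, which suggests that a cached copy would not expire on its own any time soon.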

You may want to try a different hosting provider for your bibtex file, or you can let BibBase host your bibtex file directly in case one of our premium plans makes sense for your group(s).

Thank you. From your message, I see that the last-modified timestamp on the file is Dec 12. The changes I mentioned earlier in this thread were made on Dec 12 (after the timestamp shown in your message) and on Dec 16. But there is another issue that I am still not clear about:

When I download the bib file using the link (https://research.seas.ucla.edu/licos/files/2019/08/publicactions_v1.bib), I can see the changes I made on Dec 12 in the downloaded file (change 1 in the list earlier in this thread). Therefore, I assume that the older version is no longer cached by the hosting provider. However, the BibBase page does not show these changes (it should add the URL and the abstract to that publication). I am not clear on what the issue is here.

The changes I made on Dec 16 (change 2 in the list earlier in this thread) are still missing from the file I download using the link. Therefore, it seems that my hosting provider is still distributing an older cached version of the file, as you suggested.

Thanks,

Dhaivat

What I've shown you above is verbatim what the BibBase server receives when requesting your bibtex file. I know it is confusing that you and I get different responses when we request what seems to be the same URL, but that's how CDNs work: they consist of many servers distributed around the globe, so that each user is served by the closest one and gets the requested pages as fast as possible. The BibBase server is located in Oregon and may hence be receiving the file from a different server in your hosting provider's CDN than you are.
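
To make that behavior concrete, here is a toy Python model of two edge servers with a fixed time-to-live. It is only an illustration under simplifying assumptions: the EdgeCache class, the "version 1"/"version 2" strings, and the timings are invented, and real CDNs are considerably more involved. It shows how one edge can keep serving an old copy while another edge, asked for the first time after the edit, already serves the new one.

```python
class EdgeCache:
    """Toy model of a single CDN edge server with a fixed time-to-live (TTL)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.copy = None
        self.fetched_at = None

    def get(self, origin, now):
        # Refetch from the origin only if nothing is cached or the copy has expired.
        if self.copy is None or now - self.fetched_at >= self.ttl:
            self.copy = origin["content"]
            self.fetched_at = now
        return self.copy


ttl = 31104000  # max-age from the file response in this thread, in seconds
origin = {"content": "version 1"}  # hypothetical stand-in for the bibtex file
oregon = EdgeCache(ttl)
california = EdgeCache(ttl)

oregon.get(origin, now=0)          # the Oregon edge caches version 1
origin["content"] = "version 2"    # the file is edited at the origin

print(california.get(origin, now=60))  # version 2: first request at this edge
print(oregon.get(origin, now=60))      # version 1: still within its TTL
```

In this model the Oregon edge only picks up the edit once its TTL has elapsed or its cache is purged, which matches the different results seen from the two locations.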

Here is a screenshot of me making the exact same request from my machine (in California) and from the BibBase server in Oregon. As you can see, the results are different.


 

So unfortunately there is nothing I can do here until your hosting provider refreshes the CDN cache serving the Oregon region. I've already tried tricking your provider into refreshing the file by adding extra URL parameters (?something=1). This sometimes works because the cache treats it as a different URL, but in this case it didn't.
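
For reference, the extra-parameter trick can be sketched in Python as follows. The helper add_cache_buster is a hypothetical name, not part of BibBase, and the parameter name "something" simply mirrors the example above. Whether a given CDN honors it depends on how that CDN builds its cache keys.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, name="something", value=None):
    """Return url with one extra query parameter so that caches see a new URL."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Default to the current Unix timestamp so that later calls yield new URLs.
    token = str(int(time.time())) if value is None else str(value)
    query.append((name, token))
    return urlunsplit(parts._replace(query=urlencode(query)))


bib_url = "https://research.seas.ucla.edu/licos/files/2019/08/publicactions_v1.bib"
print(add_cache_buster(bib_url, value=1))
# https://research.seas.ucla.edu/licos/files/2019/08/publicactions_v1.bib?something=1
```

A cache that ignores query strings when building its keys will return the same stored copy regardless, which would be consistent with the trick not helping here.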

That makes a lot more sense. Thank you very much for helping me with this. I am checking whether the hosting provider can resolve this issue now; otherwise, we will look for other options.