
Ability to delete all / multiple files at once

Adam Rego Johnson 4 years ago updated 3 years ago 10

Hi, would there be any way to implement a feature to delete all files in the file manager at once / select multiple files at once to be deleted? I am dealing with a large amount of PDF files, and I need to change their automatic naming convention when exporting from Zotero for various logistical reasons. These files are hosted on BibBase with a .bib file pointing to them. Being able to 'select all' files, and then deselect my .bib file on the BibBase file manager, in order to delete all PDF files at once, would be a major time saver rather than having to manually select 'delete' and 'yes' for 600 files. I don't currently anticipate needing to do something like this multiple times, but these things can happen, of course, so I figure this might be a good feature to have in general if it is not too hard to implement.

Thank you!

Adam

Completed

Agreed, this is a useful feature!  We've just added two new features to the file manager:

  • a filter, and
  • a "delete all" button

The button only deletes the files currently included by the filter. So, e.g., if you want to delete all .pdf files you could filter by "pdf" (or "pdf$" to be sure that only files ending in pdf will be included), and then hit delete all.

I hope this addresses your need. Please let me know how it goes, especially if you run into any issues using this.

Thanks for the suggestion.

Hi Christian, I used the feature today and had no issues filtering by PDF. The ability to see how many files are in the system by pressing "delete all" and not executing the function is very useful as well.


I tried to filter by ".bib" just to check and that didn't seem to work fully, as it was pulling, in addition to my two .bib files, a few articles that presumably had those letters but not in that order. Encapsulating ".bib" in quotations didn't return any results, so I presume further operators like that aren't implemented.

Thank you for such a quick implementation of this feature, this is very helpful.

Glad it worked! For finding .bib files, try bib$. The filter, again, uses regular expressions, and the dollar sign is a special character in regular expressions indicating the end of the string.
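To illustrate the filters discussed above, here is a small Python sketch. It assumes the file manager's filter behaves like an unanchored, case-sensitive regular-expression search, which is what the replies in this thread suggest but is not confirmed in detail. The file names and the `filter_files` helper are made up for illustration and are not part of BibBase. In a regular expression an unescaped dot matches any single character, which is why ".bib" also picks up names that merely contain "bib" after some other character, and quotation marks are treated as literal characters.

```python
import re

def filter_files(pattern, filenames):
    """Keep names in which the pattern matches anywhere (unanchored search).

    Hypothetical helper: it only mimics how a regex-based filter might behave.
    """
    rx = re.compile(pattern)
    return [name for name in filenames if rx.search(name)]

# Made-up file names for illustration.
files = ["refs.bib", "library.bib", "habib2019.pdf", "smith2020.pdf", "pdf_notes.txt"]

# The unescaped dot matches any character, so "abib" in habib2019.pdf matches too.
print(filter_files(".bib", files))     # ['refs.bib', 'library.bib', 'habib2019.pdf']

# Anchoring with $ keeps only names that end in "bib".
print(filter_files("bib$", files))     # ['refs.bib', 'library.bib']

# Escaping the dot additionally requires a literal "." before "bib".
print(filter_files(r"\.bib$", files))  # ['refs.bib', 'library.bib']

# "pdf" matches anywhere in the name; "pdf$" only at the end.
print(filter_files("pdf", files))      # ['habib2019.pdf', 'smith2020.pdf', 'pdf_notes.txt']
print(filter_files("pdf$", files))     # ['habib2019.pdf', 'smith2020.pdf']

# Quotation marks are literal characters, so no name matches.
print(filter_files('".bib"', files))   # []
```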

Hi Christian, this feature has become a regular part of my workflow and has been very useful, but I've encountered some issues in the past 24 hours using it. For context, when I have updates in my library, which I periodically need to export to BibBase, I have taken to simply deleting all files in my account and then reuploading all .bib and pdf files. This seems easiest for me given the size of the library and the tiny changes that are hard to keep track of, e.g. changes in filenames or changes in a file that has kept the same name. I upload in batches of 100-200 files in order to avoid hitting the servers too hard, as recommended in a prior post of mine.

I seem to have encountered a similar slowdown issue now, though: when I select "delete all", the (~800) files delete very slowly, or not at all. I first encountered this around 7-9PM EST last night, when the files were deleting one by one very slowly. I repeated the delete all request a couple of times, then tried once more from another browser. I exited the pages in both browsers to see if it would stop. I believe I reopened the page 1-2 times and it was still slowly deleting when I came back to it. When I checked back within 30 minutes, it had stopped after 100 files or so were deleted, and it hasn't deleted any more since. I manually deleted my two .bib files to replace them and that seemed to work ok. Just now, I've gone on my other machine and tried to delete a smaller set of files (all the .xlsx in my library, I think 22), and the behavior repeated: it very slowly deleted 9 files and then stopped, with nothing further deleted when I closed the page and checked back 20 minutes later.

Hopefully I have not caused another issue elsewhere on BibBase servers. Is there anything to be done about this? While I am waiting an updated .bib is in place, but ~100 of the files some references point to are of course unavailable. I thought about setting it to delete again and seeing if it would slowly get through them so that I could eventually complete my full replacement of all the files, but I don't want to potentially make any issues worse.

Under review

Hi Adam,

I think we really need to find a way to avoid this workflow. Deleting and then re-uploading over a gigabyte each time there is a change is not good for a number of reasons, not least of which being what you saw, namely that it will slow things down. I believe this is not really a bug, but rather an effect of the pricing structure of our cloud provider (AWS): reads and writes can burst (go really fast) but only for so long. Once that burst budget has been used up things can get very slow, and I think that's what's happening here.
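As a rough illustration of the burst behavior described above, the following Python sketch models a fixed burst budget: operations run fast while credits remain and at a slow baseline rate afterwards. This is a deliberately simplified model and not AWS's actual accounting. The function name, the credit count, and both rates are made-up assumptions, and credit replenishment is ignored.

```python
def seconds_to_finish(total_ops, burst_credits, burst_rate, baseline_rate):
    """Estimate how long total_ops take under a simple burst-credit model.

    Assumptions (hypothetical, for illustration only):
      - each operation performed at burst speed consumes one credit
      - credits do not refill while the job is running
      - rates are given in operations per second
    """
    fast_ops = min(total_ops, burst_credits)  # ops served from the burst budget
    slow_ops = total_ops - fast_ops           # remaining ops run at the baseline rate
    return fast_ops / burst_rate + slow_ops / baseline_rate


if __name__ == "__main__":
    # Made-up numbers: 100 credits, 50 ops/s while bursting, 0.5 ops/s afterwards.
    for n_files in (50, 100, 800):
        t = seconds_to_finish(n_files, burst_credits=100, burst_rate=50.0, baseline_rate=0.5)
        print(f"{n_files} deletions: about {t:.0f} s")
```

In this toy model, a job that fits inside the budget finishes in seconds, while one that exceeds it spends almost all of its time at the baseline rate. That is qualitatively the pattern reported in this thread, although the real limits and numbers are not known here.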

Would it help if we added a de-duplication feature, where, when dropping files that already exist in the account, there is a popup asking whether to skip or replace them? If I understand correctly, the only reason you are reuploading everything is because it's hard for you to know which files were added and deleted, right? This would not handle the deleted ones, but would give us a viable way to add new ones without needing to reupload those that are unchanged.

Would that work?

Hi Christian,

Sorry to hear I have been inadvertently causing issues, I didn't think it would take that much bandwidth. I agree it is not an ideal workflow, but pursued it as it saved me quite a bit of time compared to the alternative.

I didn't really specify correctly as I didn't think it would matter, but the issue is more than just de-duplicating. I could without too much issue find new files that have been attached to a new reference in my library. The problem comes from when references' existing files are changed -- either (1) because they have been replaced and then given the same filename (all attachments are automatically renamed to their reference's BibTeX citation key through Better BibTeX and Zotfile), or (2) because the reference itself has changed its information such that its BibTeX citation key and thus its file's filename has changed, whether or not the file's content has changed -- or when references' existing files should be deleted as the reference doesn't exist anymore.

When I realized what a nightmare this made for periodically updating the filebase in BibBase, I came to the following solution:

  1. Always export all references' files in the library to a specific folder
  2. The next time you export, export the files first to a new folder.
  3. Compare the two with the program Total Commander, which has a "synchronize directories" option, where you can review the files in two folders to highlight:
    1. Which files match
    2. Which files don't (e.g., the old export has a file that is not in the new export, likely because its reference has been deleted, or the new export has a file that is not in the old export at all, likely because a reference has been added)
    3. Which files match in filename but not content (the actual data of the file), or vice versa (so a file that has been replaced but has the same name, or a file that just now has a new name).

I then have to manually go through these files and confirm what has to be added, what has to be deleted, and what has to be replaced, and one by one add these files to BibBase and delete and/or replace any of the existing files on BibBase. If not for having to upload to the website, I should be able to simply use the program's "synchronize" options -- but I instead need to select out which files have been added, deleted, or changed, and port this over into the BibBase folder. This was of course tedious and time consuming even for me, and prone to small errors going totally unnoticed for long periods of time. It was thus much quicker and more secure for me to simply upload and replace everything at once, so I always knew that the .bib had the exact matching files necessary uploaded. But I see that that is not tenable for you.
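The folder comparison described above can also be scripted. The following is a minimal Python sketch, not a replacement for Total Commander and not anything BibBase provides. It assumes both exports are flat folders (no subfolders), and the function names and report categories are hypothetical. Files are compared by name and by SHA-256 digest of their content, so a file with the same name but different content is reported as replaced, and a file with the same content under a new name is reported as renamed.

```python
import hashlib
from pathlib import Path

def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot(folder):
    """Map each file name directly inside folder to its content digest."""
    return {p.name: file_digest(p) for p in Path(folder).iterdir() if p.is_file()}

def compare_exports(old_folder, new_folder):
    """Classify files in the new export relative to the old export."""
    old, new = snapshot(old_folder), snapshot(new_folder)

    # Index old files by content so renames can be detected.
    old_by_digest = {}
    for name, digest in old.items():
        old_by_digest.setdefault(digest, []).append(name)

    report = {"unchanged": [], "replaced": [], "renamed": [], "added": [], "deleted": []}
    renamed_from = set()
    for name, digest in sorted(new.items()):
        if name in old:
            key = "unchanged" if old[name] == digest else "replaced"
            report[key].append(name)
            continue
        # Same content under an old name that no longer exists: treat as a rename.
        candidates = sorted(
            o for o in old_by_digest.get(digest, [])
            if o not in new and o not in renamed_from
        )
        if candidates:
            renamed_from.add(candidates[0])
            report["renamed"].append((candidates[0], name))
        else:
            report["added"].append(name)

    # Old files that are neither present in the new export nor accounted for by a rename.
    report["deleted"] = sorted(n for n in old if n not in new and n not in renamed_from)
    return report
```

Calling `compare_exports("export_old", "export_new")` (folder names are placeholders) returns a dictionary listing which files would need to be uploaded (`added`, `replaced`, and the new names in `renamed`) and which would need to be removed on the server (`deleted` and the old names in `renamed`). The upload and deletion steps themselves would still have to be done through the file manager.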

A de-duplication feature would help for some but not all of this process with Total Commander, as it would allow me to quickly grab any files with the same filename but different content and chuck them in and allow them to replace the old files (this is not currently possible I believe, it would just add another copy of the file with no differentiation). Outside of that, if you have any suggestions I'm open to hearing them, I could be totally missing something, but otherwise I'll have to go back to this workflow.


I'm assuming there is not a de-duplication feature you could implement that is able to detect files with the same filename but different content, and give me the option to only replace those specifically. That would take much of the complexity out. But I'm not sure that's a thing haha.

Hi Christian,

I have been waiting to see if the burst budget would be replenished at any point so I can use it a final time to delete and reupload all my files and start "fresh" with the method described above, but every week when I have attempted to "delete all", one file has been deleted before things slow to a crawl. I currently have most but not all of my files (I am missing about 100) due to the initial delete all attempts described in my original post, and at this point I need a fresh wipe so I can be certain what is actually on the server -- by keeping a verified copy on my own machine of all the files as uploaded on x date, so that going forward I can compare and contrast the files to know exactly which ones I need to upload or replace, as described. Is there any way to delete all the files currently on my account (gcwealthproject)? (Aside from the two .bib files I have.)


I am fine to upload all my library files slowly over a few days if needed, and to avoid the batch deletion and upload process as requested in favor of a slower alternative, but can't let the library sit in this state much longer so need to figure something out. Apologies for the hassle, and thanks for all the help.

Hi Adam,

I'll try and manually delete them. I'll report back here.

Christian

OK, it worked. I've deleted all files except the two bib-files.

Thank you, much appreciated.