App for Cloudflare® Pro

App for Cloudflare® Pro 1.8.1

Move files into R2

Any chance you have ideas to make this migration faster? I'm using it for a client on shared hosting, so there's no SSH access. I tried copying 50 at a time, and I think I hit some sort of rate limit with the host, because the migration got killed. 10 at a time seems to work, but it's been running for three hours and has only transferred about 1,300 out of 6,400.

I'm guessing it's not easily possible or you would have done it, but I figured I'd ask.

edit: More importantly, this client has larger sites. My concern is that I don't want to put redirects in place until the R2 migration is complete. But if the migration takes days, the site will be broken until complete.
 
There are a couple of things going on here... A single WordPress media object is internally a bundle of many files (mostly the different sizes that WordPress generates), so even though it says it has transferred 1,300 media, the real number of files is quite a bit higher. If it's a new bucket, you can look at how many objects are in the bucket; depending on the media type and sizes, it may be closer to 10,000 underlying files that have been transferred.

Technically there could be some migration process that's multi-threaded and doing things in parallel, but to do it *right* you really need to be able to do it via shell access (SSH), so that wouldn't be particularly helpful here anyway. Doing it in parallel via the web interface is possible, but you would be trading away stability and correctness to make what should be a one-time process faster. By design HTTP requests are "stateless", meaning you can't really guarantee that they run at all, run successfully, run only once, etc. So firing off a bunch of parallel HTTP requests and just hoping they worked is asking for problems. Since we are bound to the web interface, it's better to do it as a single-threaded process, so we get feedback on each item and can internally retry if one fails.
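To make the "feedback on each and internally retry" idea concrete, here is a minimal sketch in Python (not the plugin's actual PHP code). The names `migrate_sequentially`, `upload`, and `max_attempts` are made up for illustration; `upload` stands in for whatever does the real transfer and is assumed to return True on success.

```python
from __future__ import annotations

from typing import Callable, Iterable


def migrate_sequentially(
    items: Iterable[str],
    upload: Callable[[str], bool],  # hypothetical: returns True when the transfer succeeded
    max_attempts: int = 3,          # assumed retry budget per item
) -> tuple[list[str], list[str]]:
    """Upload items one at a time, retrying each failure before moving on."""
    done: list[str] = []
    failed: list[str] = []
    for item in items:
        for _ in range(max_attempts):
            if upload(item):
                done.append(item)  # we only record success after seeing it
                break
        else:
            # Every attempt failed: report it instead of silently losing it.
            failed.append(item)
    return done, failed
```

Because each item is confirmed before the next one starts, anything that never made it ends up in `failed` rather than going missing, which is the part a fire-and-forget parallel approach can't give you.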

I kind of touched on it earlier in the thread over here. I don't particularly love that it's not faster, but for the sake of consistency and reliability it's probably just how it has to be. Again, it's a one-time process, and the system can have a mixture of some media in local storage and some in R2, so it's not a case where speed is an absolute necessity and things are broken until it's finished. My first idea of doing it with WordPress's cron system ended up not being too good either, because some WordPress installations disable cron, so it's not even available to some installs.

Someday I may build a multi-threaded shell-based (SSH) migration tool, but in your scenario where you are limited to it being HTTP requests, there really isn't going to be a particularly great (reliable) way to do them in parallel.
 
Would it be possible to do a background migration, then flip the switch to use R2? That way I wouldn't really care how long it took, and the change (and then rewrite rules) could be somewhat more instant.

Also, does it remove the local files? If not, is there an easy way to do that?
 
Would it be possible to do a background migration, then flip the switch to use R2? That way I wouldn't really care how long it took, and the change (and then rewrite rules) could be somewhat more instant.
You could use rclone to move stuff quickly (in parallel), but again it's something you would need CLI/SSH access to use.

If you look at the bucket in Cloudflare's R2 dashboard, you can see the file structure... it's really just the contents of the wp-content/uploads/ folder that is moved to R2. For example, an image like wp-content/uploads/2023/01/some-fancy-image.jpg lands in the R2 bucket with an object key of 2023/01/some-fancy-image.jpg.
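If you want to sanity-check what key a given file should end up under before copying things yourself, the mapping can be sketched like this in Python. This is only an illustration of the path-to-key rule described above, not code from the plugin; `r2_object_key` and `site_prefix` are hypothetical names, and the prefix is assumed to be empty by default.

```python
from pathlib import PurePosixPath

# Everything under this folder is what ends up in the bucket.
UPLOADS_ROOT = PurePosixPath("wp-content/uploads")


def r2_object_key(local_path: str, site_prefix: str = "") -> str:
    """Return the object key for a file under wp-content/uploads/.

    site_prefix is an assumption for illustration (empty by default).
    Raises ValueError if the path is not inside the uploads folder.
    """
    relative = PurePosixPath(local_path).relative_to(UPLOADS_ROOT)
    return site_prefix + relative.as_posix()


key = r2_object_key("wp-content/uploads/2023/01/some-fancy-image.jpg")
print(key)  # 2023/01/some-fancy-image.jpg
```

The takeaway is that the key is just the path relative to the uploads folder, so a tool that copies that folder's contents to the bucket root preserves the layout.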

So if someone were to use rclone to move the files, it takes far less time for the plugin to simply tag media as being in R2 in its local database.

Long story short: you could use rclone to move the files, then go into the plugins/app-for-cf-pro/src/DigitalPoint/Cloudflare/Base/PubAvanced.php file and comment out this one line, so the plugin does all the other stuff minus the actual moving of the file to R2:

PHP:
// $cloudflareRepo->addR2Object($bucketName, $sitePrefix . $filename, file_get_contents($filePath));

Also, does it remove the local files? If not, is there an easy way to do that?
It does automatically remove the local files as each is migrated.

Without shell/CLI/SSH access though, any solution that runs things in parallel is not going to be reliable (you might just end up with missing media files that never made it to R2 for some unknown reason, since it would be a fire-and-forget system).
 
Okay, I was able to get a backup of the site locally, then rclone everything into R2. Then I edited the file so that the actual copy doesn't happen. Yay

Now I have 12 files seemingly 'stuck' in the local filesystem. It just loops over and over, and never goes to zero. Thoughts?
 
I didn't see anything blatantly obvious in the code, but the system relies on WordPress properly updating the metadata it keeps for each media item. If that metadata doesn't get updated to flag the item as already in R2 (for whatever reason), the plugin will just think it's still not there.

One thing I can think of is that the metadata records for those media items are missing completely somehow, or that you have a plugin that intercepts media metadata writes and sometimes decides not to do them for whatever reason.

That being said, if you go into the app-for-cf-pro/src/DigitalPoint/Cloudflare/Admin/Template/R2MigrateBatch.php file and add this:
PHP:
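// Dump the pending media (post) IDs. print_r() is assumed here; a bare expression would output nothing.
print_r($this->params['media']);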

after this:
PHP:
$this->addAsset('css');

It won't be particularly pretty, but it will at least let you see the "post" IDs for the media in question (don't ask me why WordPress classifies media as posts internally).

From there, you can jump to those media items by going to the URL (where xxxx is one of the IDs):

wp-admin/upload.php?item=xxxx

Maybe you can see if there's something noticeably unique about those media items that I'm not thinking of. If the attachment meta record is intact, you should see info like dimensions (if it's an image), when it was uploaded, etc. If none of that is there, that may be the problem (WordPress doesn't have the meta record for the media attachment for whatever reason).
 
So it looks like the pictures are there in the local filesystem (not R2). But when I try to view them in the media gallery, they come up blank.

Oh shit, actually I see the variations, but I think the main image is gone. It isn't in R2 either.

edit: Also, I think you said that the local files get removed after migration into R2. I don't see that happening. So I guess the good news is that it didn't delete anything. But I eventually would like it to delete the local files.
 
Ya, it sounds like those media items are mucked in WordPress to begin with. Maybe someone deleted them at some point but WordPress failed at fully deleting them: the original file was deleted, but not its variations, and I *suspect* the meta record for them was deleted too (which would cause the loop you are seeing).

If you go into the app-for-cf-pro/src/DigitalPoint/Cloudflare/Helper/WordPressPro.php file and change the 3 places that have LEFT JOIN to INNER JOIN, that would leave those items out of its migration process when the meta record is missing. The plugin isn't going to be able to repair a mucked media item, but it can at least ignore it so it doesn't get stuck in a loop.
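To show why that join change matters, here is a small self-contained illustration using Python's built-in sqlite3. The `media` and `media_meta` tables, the `in_r2` column, and the query are made up for the example; they are not WordPress's real schema or the plugin's actual queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE media (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE media_meta (media_id INTEGER, in_r2 INTEGER);
    INSERT INTO media VALUES (1, 'ok.jpg'), (2, 'migrated.jpg'), (3, 'broken.jpg');
    INSERT INTO media_meta VALUES (1, 0), (2, 1);  -- no meta row for id 3
""")

# Hypothetical "what still needs migrating?" query; only the join type varies.
PENDING = """
    SELECT m.name FROM media m
    {join} JOIN media_meta mm ON mm.media_id = m.id
    WHERE mm.in_r2 IS NULL OR mm.in_r2 = 0
    ORDER BY m.id
"""


def pending(join: str) -> list[str]:
    """Return names of media still considered unmigrated for the given join type."""
    return [row[0] for row in conn.execute(PENDING.format(join=join))]


left_pending = pending("LEFT")    # includes the item with no meta row
inner_pending = pending("INNER")  # skips the item with no meta row
print(left_pending, inner_pending)
```

With LEFT JOIN the item that has no meta row always shows up as pending, because there is nothing to flag it as migrated, so a loop over "pending" items never reaches zero. With INNER JOIN that item simply drops out of the result.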

I don't have a WordPress install with broken media in order to test that, so let me know if that works.
 
According to my client, it seems like they were actually fine before this. There are now quite a few broken images.
 