Differences

This shows you the differences between two versions of the page.

data:retrieval [2024/04/10 13:58] – Removed all the text about the data format (Richard Bowers)
data:retrieval [2026/06/18 10:09] (current) – [External Sequencing Service Users] (Johanna Barbieri)
Line 5: Line 5:
 ==== CRUK-CI Researchers ====
  
-We provide a tool for downloading files for projects, libraries and runs that you can use from the command line. The full user manual for the download tool can be found [[https://internal-bioinformatics.cruk.cam.ac.uk/docs/clarity/internalapi/downloadtool.html|on this internal web page]]. Please visit the page to download the tool, find instructions for how to use it and also how to install Java on your personal machine (link requires you to be in the building or running the VPN).
+We provide a tool for downloading files for projects, libraries and runs that you can use from the command line. The full user manual for the download tool can be found [[https://internal-bioinformatics.cruk.cam.ac.uk/docs/clarity/downloadtool/userguide.html|on this internal web page]]. Please visit the page to download the tool, find instructions for how to use it and also how to install Java on your personal machine (link requires you to be in the building or running the VPN).
  
 ==== External Sequencing Service Users ====
Line 11: Line 11:
 Outside of CRUK-CI, data is delivered through the FTP site: ''ftp1.cruk.cam.ac.uk''.
  
-This is a vanilla FTP site that should be accessible by any FTP client you choose. Your group will have been provided a user name and password for accessing the site when your group arranged to use the CRUK-CI sequencing service. Your data will be in a private region of the server only accessible with your group's credentials. The site is read only.
+This is an FTP site running the FTP protocol with TLS encryption (sometimes known as ''FTPS''). You should be able to connect to the site using any up-to-date FTP client you choose. Your group will have been provided a user name and password for accessing the site when your group arranged to use the CRUK-CI sequencing service. Your data will be in a private region of the server only accessible with your group's credentials. The site is read only.
  
 **Files are available on the FTP site for a guaranteed thirty (30) days after sequencing. You MUST fetch the files in this time period. The files will be removed from the FTP site once this time has elapsed.**
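
As a rough illustration of the FTPS connection described in this hunk, the Python sketch below uses the standard library's ''ftplib.FTP_TLS'' class. It assumes the server accepts explicit FTPS (the client connects on the normal FTP port and then upgrades to TLS), which the page does not state. The function name ''list_remote_files'' and its parameters are hypothetical.

```python
from ftplib import FTP_TLS


def list_remote_files(host, user, password, directory="."):
    """Log in over explicit FTPS and return the names found in `directory`.

    Assumption: the server supports explicit FTPS (AUTH TLS on the
    control connection). This is not confirmed by the page itself.
    """
    with FTP_TLS(host, timeout=30) as ftps:
        # login() secures the control connection before sending credentials.
        ftps.login(user, password)
        # Switch the data connection to TLS as well.
        ftps.prot_p()
        return ftps.nlst(directory)


# Example call (needs network access and your group's own credentials):
# names = list_remote_files("ftp1.cruk.cam.ac.uk", "your-user", "your-password")
```

Nothing in the sketch writes to the server, which matches the site being read only.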
Line 29: Line 29:
 === FTP Clients ===
  
- There are many FTP clients available on the web one can use to fetch files from our FTP site. Some clients are:
+There are many FTP clients available on the web one can use to fetch files from our FTP site. We officially support two of them:
  
   * [[https://filezilla-project.org|Filezilla]] (desktop)
+  * [[https://lftp.yar.ru|LFTP]] (command line)
+
+There are others you can use, though we have not properly tested them:
+
   * [[https://cyberduck.io|Cyberduck]] (desktop)
   * [[https://www.coffeecup.com/free-ftp|CoffeeCup Free FTP]] (desktop)
-  * [[https://lftp.yar.ru|LFTP]] (command line)
   * [[https://www.ncftp.com/ncftp|NcFTP]] (command line)
  
-On Linux, you might find that these programs are available through the platform's package management system. Most web browsers will also allow you to navigate the FTP site if you use the FTP protocol in the address bar: [[ftp://ftp1.cruk.cam.ac.uk/]]. The browser will prompt for your user name and password. The web browser is handy for having a look at your area of the server and previewing the reports but a proper FTP program is recommended for fetching the files.
+On Linux, you might find that these programs are available through the platform's package management system.
  
 We have become aware of some users using the Mac's //Finder// application to connect to the FTP server and copy the files. While convenient, it appears that //Finder// can silently truncate files while copying if the connection to the FTP server drops. Thus we do not recommend using //Finder// or //Windows Explorer// to copy the files: use a proper FTP program that will report errors. Above all, and regardless of the program used, **you must check your files against the checksums after downloading** as described above.
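
The checksum comparison required above can be sketched in Python as follows. This is only an illustration: the checksum algorithm and file layout are described elsewhere on the wiki page, not in this diff, so MD5 is an assumption here, and ''file_digest'' and ''matches_checksum'' are hypothetical helper names.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path`, reading it in chunks.

    Assumption: MD5 is the default only for illustration; pass the
    algorithm actually used for the published checksums.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large sequencing files never sit in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex, algorithm="md5"):
    """Return True if the file's digest equals the published value."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

A file truncated by a dropped connection will produce a different digest, so ''matches_checksum'' returning ''False'' means the file should be fetched again.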
Line 43: Line 46:
 === Troubleshooting ===
  
-Our FTP server is pretty much as vanilla as they come and should not cause any problems. Nonetheless, occasionally people do tell us they cannot connect, and so far this has always been problems at the client end. Here are some things to check.
+Most FTP clients, and certainly //lftp// and //FileZilla//, handle the TLS encryption automatically. Other clients may need you to specifically tell them to use an encrypted connection.
+
+Occasionally people tell us they cannot connect, and so far this has always been problems at the client end. Here are some things to check.
 
-  - Make sure any encryption option is turned off. Some clients have encryption, sometimes called SSL or FTPS, turned on by default.
+  - ''FTPS'' is not the same as ''SFTP''. The former is the FTP protocol with encryption, the latter is file transfer over the secure shell protocol. You cannot use ''sftp'' or ''scp'' with our FTP server.
+  - Make sure the option for ''FTPS'' or ''SSL'' encryption is turned on in your client if it has explicit options for this.
   - Make sure your computer can see our server. Try "pinging" the FTP server with "''ping ftp1.cruk.cam.ac.uk''" from the command line. The server will echo back a reply if your pings are getting through. If ping says the packets are not being returned, please check with **your** IT department to check network connectivity. The problems have never yet been at the CRUK-CI end; if our FTP site does need to be taken out of commission for a while we will let all our collaborators know beforehand.
+  - If you are able to log in but get errors when trying to download, please ask **your** IT team to check they are allowing access for FTP ephemeral ports. The range we use is 41698-41707.
+  - If you still have problems after checking all of the above, please contact the CRUK-CI IT team on ''[[mailto:ithelpdesk@cruk.cam.ac.uk|ithelpdesk@cruk.cam.ac.uk]]'' and cc Genomics ''[[mailto:Core-Genomics-Staff@cruk.cam.ac.uk|Core-Genomics-Staff@cruk.cam.ac.uk]]''. Note this is a different address to the usual Genomics help desk and should only be used for queries about connectivity problems to our FTP server; all other queries need to go to the Genomics help desk as normal.
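
When reporting a download failure to your IT team, it can help to show which data port the server offered. The Python sketch below pulls the port out of a passive-mode reply copied from an FTP client's log and checks it against the range quoted in the list above. It assumes the "ephemeral ports" are the server's passive-mode data ports, which the page does not say explicitly; ''passive_port'' and ''in_published_range'' are hypothetical helper names.

```python
import re

# Port range quoted in the troubleshooting list (41698-41707 inclusive).
PUBLISHED_PORTS = range(41698, 41708)

# "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" as defined for PASV.
_PASV_REPLY = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")
# "229 Entering Extended Passive Mode (|||port|)" as defined for EPSV.
_EPSV_REPLY = re.compile(r"\(\|\|\|(\d+)\|\)")


def passive_port(reply):
    """Return the data port announced in a PASV (227) or EPSV (229) reply."""
    match = _PASV_REPLY.search(reply)
    if match:
        # PASV encodes the port as two bytes: high * 256 + low.
        return int(match.group(5)) * 256 + int(match.group(6))
    match = _EPSV_REPLY.search(reply)
    if match:
        return int(match.group(1))
    raise ValueError(f"not a passive-mode reply: {reply!r}")


def in_published_range(port):
    """Return True if `port` lies in the range the firewall must allow."""
    return port in PUBLISHED_PORTS
```

For example, a log line such as ''227 Entering Passive Mode (10,0,0,1,162,226)'' (an invented address) announces port 162 × 256 + 226 = 41698, which is inside the range.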