I have a Python scraping script that automatically retrieves the French news from Le Français Facile each day and converts the scraped HTML markup to PDF using pdfkit.
Since I run a website that publishes updated French news every day, it would be nice to have the whole process automated, preferably in Python.
So the content on the website https://dailyfrench.fr/ is generated and updated automatically each time a new RFI news item appears (scroll down the site to see the news and files already scraped).
The handy ftplib package, part of the Python standard library, comes to the rescue for uploading the generated files over FTP.
from ftplib import FTP

ftp = FTP('host')
ftp.login(user='username', passwd='password')
# set working directory on FTP server
ftp.cwd('ftpPath')
headerPath = "header.html"
# use STOR + filePath to save the file on FTP
# binary files such as PDF or audio must be opened in 'rb' mode
with open(headerPath, 'rb') as header:
    ftp.storbinary('STOR ' + headerPath, header)
# close the connection politely once the upload is done
ftp.quit()
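Since each day produces several files (HTML, PDF, audio), the upload above could be wrapped in a small helper. This is only a minimal sketch, not the script actually running behind the site: the function name `upload_files` and the file name `news.pdf` are hypothetical, and `host`, `username`, `password` and `ftpPath` are the same placeholders as before.

```python
from ftplib import FTP
from pathlib import Path


def upload_files(ftp, local_paths, remote_dir):
    """Upload each local file into remote_dir over an already open FTP connection.

    `upload_files` is a hypothetical helper; `ftp` is expected to behave
    like a logged-in ftplib.FTP instance.
    """
    # Set the working directory on the FTP server once for all files.
    ftp.cwd(remote_dir)
    uploaded = []
    for local_path in map(Path, local_paths):
        # Open in binary mode so PDF and audio files are sent unchanged.
        with local_path.open("rb") as fh:
            ftp.storbinary(f"STOR {local_path.name}", fh)
        uploaded.append(local_path.name)
    return uploaded


# Example use (placeholders, assumed values):
#
# with FTP("host") as ftp:
#     ftp.login(user="username", passwd="password")
#     upload_files(ftp, ["header.html", "news.pdf"], "ftpPath")
```

Using `FTP` as a context manager closes the connection automatically when the block ends, even if one of the uploads raises an error.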