
Give life to your Data Science Apps using Streamlit


"Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from many structural and unstructured data." - This is the wikipedia definition of Data Science, so what does it simply mean is that we, generate an enormous amount of data in our day to day life and using data science this data can be structured, processed and many useful insights and information can be obtained benefiting individuals.


How can it benefit us? Imagine you log on to YouTube to binge-watch your favourite videos, but the home page shows only irrelevant videos that you have never watched and would never choose to watch - you would be irritated. Instead, the moment you open YouTube, the videos you were about to search for pop up on the screen; the world suddenly feels very understanding, and the time you would have spent searching is saved. Data science helps us this way: it reduces the time spent on repetitive tasks through predictive analysis and automation.

Data Science is a vast domain consisting of subdomains like:

  • Machine Learning

  • Artificial Intelligence

  • Natural Language Processing

  • Image Processing

  • Audio Modelling

  • Feature Engineering

  • Data visualization

  • Data Analytics

  • Deep Learning

and more...


Now, from a developer's point of view, we data science enthusiasts love to explore these areas by putting our coding knowledge into practice with the datasets available on Kaggle, GitHub and other data resources.

Real-world data science problems are usually tackled in Python, because Python has an excellent set of built-in libraries, and new packages supporting data science are added all the time. But despite Python being a great language for the backend, it is not a very UI-friendly language.


This is why data science enthusiasts often have no good way to exhibit the wonders they have done with data. We can share our repositories on GitHub, but that is not very appealing, and most people wouldn't try them out.

Imagine you cook delicious food but can't serve it in an appealing way, so there is no one to comment on it or appreciate it - the sad part!!!

But the sad story ends here, with 'Streamlit'.

Streamlit is an open-source Python library that makes it easy to build beautiful custom web apps for machine learning and data science. It provides the essential widgets an application needs for feeding data and parameters into your data science code. Not only that, the whole application, with all its packages, can be turned into a web app and run using Streamlit. Want to know how? Read on...

First, install Streamlit from your JupyterLab terminal, Windows PowerShell or command prompt using the command,


pip install streamlit

Install all the other libraries and packages your application needs. Then write your code in a Python script (a .py file) and import streamlit in it.
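
The skeleton of such a script is tiny; here is a minimal sketch (the file name app.py and the message are just placeholders for illustration):

import streamlit as st

# every Streamlit app is just a Python script that calls st.* widgets top to bottom
st.write("Hello, data science world!")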

If you need any kind of heading on the page, you can add it with,


 st.title("WELCOME TO THE WORLD OF MUSIC!")

If you want to add an image to make the page look good, or an explanatory video, you can do so with a couple of lines. For a video, add:


vid=open("example.mp4","rb")
st.video(vid)
st.markdown("<span style=“background-color:#121922”>",unsafe_allow_html=True)

where example.mp4 is a file already stored in your working directory.
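
A static image works the same way; st.image takes a local file path (logo.png here is just a placeholder name) plus an optional caption and width:

st.image("logo.png", caption="My app logo", width=300)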

To upload an audio, video or any other type of file into the program for prediction (simulation), add this line to your code,

file_to_be_uploaded = st.file_uploader("Choose an audio...", type="wav")

The type argument controls which extensions are accepted:

  • for audio: wav or mp3

  • for video: mp4

  • for images: jpg, jpeg, png

  • for other files: pdf, pptx, docx etc.
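
Note that st.file_uploader returns None until the user actually picks a file, so it is usual to guard the rest of the code with a None check. A minimal sketch for the audio case:

file_to_be_uploaded = st.file_uploader("Choose an audio...", type="wav")
if file_to_be_uploaded is not None:
    # play the clip back in the browser; the same object can be passed on to your model code
    st.audio(file_to_be_uploaded, format="audio/wav")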

You can also customise this: instead of uploading an audio file, you can record one and send it as input to the program. First insert a button on the page using this code,

if st.button("Start Recording"):
	with st.spinner("Recording..."):
		record(duration)

The record(duration) function handles the process of recording the audio file and saving it,

import pyaudio
import wave

def record(duration):
    """Record `duration` seconds of audio from the microphone and save it as recorded.wav."""
    filename = "recorded.wav"
    chunk = 1024
    FORMAT = pyaudio.paInt16
    channels = 1
    sample_rate = 44100
    record_seconds = duration
    p = pyaudio.PyAudio()
    # open a microphone stream
    stream = p.open(format=FORMAT,
                    channels=channels,
                    rate=sample_rate,
                    input=True,
                    output=True,
                    frames_per_buffer=chunk)
    frames = []
    # read the stream chunk by chunk until the requested duration is reached
    for i in range(int(sample_rate / chunk * record_seconds)):
        data = stream.read(chunk)
        frames.append(data)
    # stop and close the stream only after the loop has finished
    stream.stop_stream()
    stream.close()
    p.terminate()
    # write the collected frames to a .wav file in the working directory
    wf = wave.open(filename, "wb")
    wf.setnchannels(channels)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(sample_rate)
    wf.writeframes(b"".join(frames))
    wf.close()
    return filename

Now the 'recorded.wav' file can be used in the succeeding lines of your code.
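
For example, you could play the recording back on the page before passing it to your prediction code (a small sketch):

with open("recorded.wav", "rb") as f:
    st.audio(f.read(), format="audio/wav")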

The duration parameter passed to the record() function can be set with a slider in the sidebar, using the code


st.sidebar.title("Duration")
duration = st.sidebar.slider("Recording duration", 0.0, 3600.0, 3.0) 

This creates a sidebar containing a slider that can be adjusted as required.
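
Putting the slider and the recording button together, the flow looks roughly like this (a sketch that assumes the record() function defined above):

st.sidebar.title("Duration")
duration = st.sidebar.slider("Recording duration", 0.0, 3600.0, 3.0)

if st.button("Start Recording"):
    with st.spinner("Recording..."):
        record(duration)
    st.success("Saved the recording as recorded.wav")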

Now execute the program with a simple command:

streamlit run project.py 

(assuming your Python script is named project.py)


On entering this command, wait until the code has executed; this may take some time depending on the size of your code and the libraries it imports.

Finally, on successful execution, you can see your data science web app running with the desired UI and widgets on localhost, at a URL such as http://localhost:8501.


If you want to modify and re-execute the code, just edit your Python script and click 'Rerun' in the app.

One thing to note: Streamlit doesn't use the print statement to display output on the page; it uses st.write() instead. If you use print in your code, those lines go to the console.
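
So write your outputs with st.write(), which renders strings, numbers, DataFrames and more directly on the page; for example (prediction here is a placeholder result):

prediction = "jazz"                       # placeholder result from your model
st.write("Predicted genre:", prediction)  # appears on the web page, not in the console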


Now, to showcase your app at a public URL, you can use platforms such as Heroku or AWS.

With Heroku, you upload your files to a GitHub repository and connect that repo to your Heroku app. Apart from your Python script, you need to add a requirements.txt file, a Procfile and a setup.sh file to the repository.
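
So the repository you connect to Heroku typically contains:

  • project.py - your Streamlit script

  • requirements.txt - generated with pipreqs (see below)

  • Procfile - tells Heroku how to start the app

  • setup.sh - writes the Streamlit config when the app boots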

The requirements.txt file can be generated automatically by installing pipreqs,

pip install pipreqs

and then running it on your project directory:

pipreqs /<your_project_path>/

e.g.

pipreqs /c:/users/home/proj_directory/

The folder proj_directory should contain the Python script for which the requirements.txt file is to be created.

The Procfile should be created with the following contents,

web: sh setup.sh && streamlit run project.py

The Procfile should not have any file extension, otherwise Heroku may throw errors.

The contents of the setup.sh file are:


mkdir -p ~/.streamlit/ 
echo "\
[general]\n\
email = \"example@gmail.com\"\n\
" > ~/.streamlit/credentials.toml
echo "\
[server]\n\
headless = true\n\
enableCORS=false\n\
port = $PORT\n\
" > ~/.streamlit/config.toml

For reference, check out my GitHub repository.


Yeah!!! With this, you can give life to your data science apps and unlock them from your JupyterLab or Jupyter Notebook.

Check out my other repositories and do drop a star if you like them!!!



Check out my other posts and my profile here; for any comments, do write to me.

