Python Input Methods for Competitive Programming:

Python is known for being user-friendly, but it has one drawback: it is comparatively slow. On online judges, if the time limit for a C/C++ solution is X, the limit for Java is typically about 2X and for Python about 5X.
Several input/output techniques can reduce the time a Python program spends reading and writing data.
Suppose we need to find the sum of N numbers entered by the user.
Input: a number N, followed by N numbers on one line, separated by single spaces. For example:
1 2 3 4

Python Solutions for the Above Problem:

Normal method (Python 2.7):
1. raw_input() accepts an optional prompt argument and strips the trailing newline character from the string it returns.
2. print is a thin wrapper that formats its arguments (a space between arguments and a newline at the end) and then calls the write function of an object, which is sys.stdout by default.
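
A minimal sketch of the normal method, written for Python 3, so input() stands in for raw_input(). The function name sum_numbers and the read_line parameter are illustrative assumptions, added only so the example can run on simulated input instead of a keyboard.

```python
def sum_numbers(read_line=input):
    """Read N, then one line of N space-separated integers, and return their sum."""
    n = int(read_line())                                  # first line: how many numbers follow
    numbers = [int(tok) for tok in read_line().split()]   # second line: the numbers themselves
    return sum(numbers[:n])                               # ignore anything beyond the first N


# Simulated input so the sketch runs without a keyboard; in a real
# submission, call sum_numbers() with no argument to read via input().
sample_lines = iter(["4", "1 2 3 4"])
total = sum_numbers(lambda: next(sample_lines))
print(total)  # prints 10
```

Each call to input() handles one line at a time, which is convenient but becomes the bottleneck once the input grows large.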

A somewhat faster method using stdin and stdout (Python 2.7):

1. sys.stdin is a file object. Reading from it works like reading from any other file object; in this case the file is the standard input buffer.
2. sys.stdout.write('D\n') is faster than print 'D'.
3. Faster still is to write everything at once with stdout.write("".join(list-comprehension)). The drawback is that memory usage then depends on the size of the input.
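
The three points above can be sketched as follows, again in Python 3. The function names solve and write_all, and the use of io.StringIO to simulate the standard streams, are assumptions made for illustration; a real submission would rely on the default sys.stdin and sys.stdout.

```python
import io
import sys


def solve(stdin=sys.stdin, stdout=sys.stdout):
    """Sum N integers using readline() for input and write() for output."""
    n = int(stdin.readline())                      # readline() keeps the "\n"; int() tolerates it
    numbers = list(map(int, stdin.readline().split()))
    stdout.write(str(sum(numbers[:n])) + "\n")     # write() adds no newline on its own


def write_all(values, stdout=sys.stdout):
    """Emit many results with a single write(); the joined string is held in memory."""
    stdout.write("".join(str(v) + "\n" for v in values))


# Simulated streams so the sketch runs without real input.
fake_in = io.StringIO("4\n1 2 3 4\n")
fake_out = io.StringIO()
solve(fake_in, fake_out)
write_all([1, 2, 3], fake_out)
result = fake_out.getvalue()
```

Because write_all builds the whole output string before writing it, its memory use grows with the number of values, which is the trade-off described in point 3.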


Adding buffered pipe I/O (Python 2.7):

1. Simply add the buffered I/O code before the submission code to make output faster.
2. io.BytesIO objects implement a common interface, that of a file-like object. A BytesIO object keeps an internal pointer, and every call to read(n) advances that pointer.
3. The atexit module provides a simple interface for registering functions to be called when a program terminates normally. The sys module also provides a hook, sys.exitfunc, but only one function can be registered there. The atexit registry, in contrast, can be used by multiple modules and libraries at the same time.
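
A rough sketch of these ideas in Python 3 follows. It is not the original Python 2.7 snippet: the helper name buffer_output is hypothetical, and io.StringIO is used for the output buffer because print() in Python 3 writes text rather than bytes. The first part only illustrates how the BytesIO pointer moves.

```python
import atexit
import io
import sys

# Part 1: a BytesIO object advances its internal pointer on every read(n).
data = io.BytesIO(b"1 2 3 4\n")
first = data.read(3)       # consumes the first three bytes
position = data.tell()     # the pointer now sits after those bytes
rest = data.read()         # reads from the pointer to the end


# Part 2: collect output in memory and write it once when the program exits.
def buffer_output(target):
    """Return (buffer, flush); flush is also registered to run at normal exit."""
    buffer = io.StringIO()

    def flush():
        target.write(buffer.getvalue())
        buffer.seek(0)
        buffer.truncate(0)   # empty the buffer so a second flush writes nothing new

    atexit.register(flush)
    return buffer, flush


# Demo against an in-memory target; in a submission one might instead do
#   sys.stdout, _ = buffer_output(sys.stdout)
# at the top of the file, before any print() call.
demo_target = io.StringIO()
buf, flush = buffer_output(demo_target)
print("10", file=buf)        # lands in the buffer, not in demo_target yet
flush()                      # called by hand here; atexit would do it at exit
```

Registering the flush with atexit means the submission code itself does not need to change; every print() goes to the buffer and the real stream is written once.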

When working with a large amount of data, the normal method may not finish within the given time limit. Method 2 above helps with large amounts of I/O data, and method 3 is the fastest. Methods 2 and 3 are usually used when the input files are larger than 2 or 3 MB.
Note: The methods above are described for Python 2.7. To use them in Python 3.x, replace raw_input() with input() and call print as a function.

